
HealthyToys.org Web Site

The case study below is intended as a historical document, describing the challenges posed to Mouko by one of its clients, the Ecology Center, and the steps taken to overcome those challenges. It is meant to serve as an example of the quality of the results and overall experience a client can expect when working with Mouko. The solutions described below may not fit your particular scenario. Please contact Mouko if you have questions, or to arrange a custom solution that fits your needs exactly. This document was produced and published with permission from the Ecology Center.

Executive Summary

On December 5, 2007, a new web site was launched called HealthyToys.org (henceforth referred to simply as “the site”). The aim of the site was to educate parents about the toxic content (e.g. lead, mercury, PVC) in children’s products, toys in particular. The site was a project of the Ecology Center, an Ann Arbor, Michigan not-for-profit company that hired Mouko, a web development company also based in Ann Arbor, to create the site. The project was conceived-of in September, 2007 and development began on the site in October. Under strict embargo, the site was released to the press one week before public release.

The press lauded the site as extremely easy-to-use, useful, and comprehensive, and it was covered by most of the major news networks and papers, including national coverage on CNN and Fox News. Consequently, the site received millions of accesses on its first day and had trouble keeping up. Due largely to the efforts of Mouko, HealthyToys.org was operating fairly well by the next day, although site traffic continued at the same rate or higher. The site was considered a great success by both the Ecology Center and Mouko. Within 20 days of the site’s release, Michigan Governor Jennifer Granholm signed a law limiting the amount of lead in children’s products [1]. The following document discusses the project’s challenges, successes, and failures in detail.

Introduction and Background

Dramatis Personæ

Mouko

Founded by Alexander Ade and Justin Laby, Mouko, LLC (Mouko) is a privately-owned software and web development company based in Ann Arbor, MI. Mouko has successfully served both the corporate and not-for-profit sectors since the year 2000. It began working with the Ann Arbor based Ecology Center in Spring 2007.

The Ecology Center

A membership-based, not-for-profit environmental organization based in Ann Arbor, Michigan, The Ecology Center (“the EcoCenter”) was founded by community activists after the United States’ first Earth Day in 1970. The EcoCenter is now a regional leader that works for a safe and healthy environment where people live, work, and play. Numerous web sites are owned by the Ecology Center, including ecocenter.org, cleancarcampaign.org, stoptrash.org, leadfreewheels.org, healthycar.org, and others. Mouko was first hired by the EcoCenter in Spring 2007 to complete the implementation of a site redesign that had been begun by another developer.

The Web Site

In August, 2007, the Ecology Center approached Mouko about creating a new interactive web site that would help to inform citizens of potentially hazardous chemical content in toys and other children’s products. “Chemicals of concern” included lead, arsenic, mercury, and antimony, among several others. The Ecology Center had already begun testing and gathering data on toys using a hand-held X-ray fluorescence (XRF) device before the first meeting with Mouko. By site launch on December 5, 2007, the number of toys tested had grown to over 1,200 and the Ecology Center had partnered with numerous other not-for-profit groups across the country, some of which tested via a “digestion” method, which was capable of quantifying lead, but not other chemicals. In anticipation of the new site, the Ecology Center had already purchased the domain name, healthytoys.org. The site launched as scheduled and received immediate acclaim from news agencies across the country. It has been a significant source of revenue and a strong advertising vehicle for the Ecology Center.

In addition to showing chemical test results for the various components of each product (e.g. a green dinosaur with yellow spots was tested at both yellow and green areas [2]), a textual “level of detection” (high, medium, low) was generated for each chemical of concern, as well as an “overall level” for the product as a whole. Site visitors were also engaged in the testing process: they were able to nominate their own toys to be tested and then vote on which of the nominated toys the Ecology Center would test next; they could find products by type or brand; they could perform a free-form search that matched on name, type, manufacturer, retailer, and product code; they could add specific products to a “list” of products they had noted, to be printed later; and they could browse a list of only the best and worst toys. The layout of the site was based largely on the Ecology Center’s HealthyCar.org site, which had been developed by another company.

Challenges

The HealthyToys.org project presented a number of new and different technical and social challenges for Mouko. Although the site designers had some initial direction, HealthyToys was developed “on the fly” to a great extent, with specifications changing as the site was developed. With just over three months to test over 1,200 toys, design a feature-rich, fully-functional site, and prepare for national release on a tight budget, creative solutions were necessary. The following were some of the major hurdles encountered.

Short Time and Small Budget

The EcoCenter indicated initially that they would like Mouko to use HealthyCar.org, a site they had previously paid a web developer to create, as a template for the new site. Mouko had given this as an option, believing it would allow them to recycle existing PHP code and design templates, thereby saving the EcoCenter time and money. At first, the EcoCenter had specified that the site was to be ready by the end of October, ahead of the holiday shopping season. The decision to use HealthyCar.org’s design and interface would save precious time, but many vital questions still needed answers and there was not much time to get them.

Question: HealthyCar.org’s products were grouped hierarchically by year, make, model, and class, e.g. 2006, Audi, A4, and “Upscale Sedan,” respectively. Would this organizational method map easily to HealthyToys.org, and was this hierarchy even appropriate for a toy-oriented site?

Answer: No and No. Indeed, toys are not organized the same way as cars, nor do people tend to think of children’s products in terms of a “make,” “model,” or “model year” as they do with automobiles. There are also many more manufacturers of children’s products than of vehicles, so it would be impractical if not impossible to produce a concise list of all manufacturers on a single page, as had been the case with HealthyCar.org. Unlike cars, toys do not necessarily fall into a single “class” of product; while a vehicle cannot be both a “sedan” and a “pickup truck,” a children’s product might very well fit into multiple industry classes. For example, the EvenFlo “Switch-a-roos Crinkle Bee” is a “High-seat-attaching,” “Plush,” “Rattling” product. Mouko would need to change fundamentally the way the data were organized while simultaneously allowing for multiple “classes” of product for each toy. In order to make data access and storage efficient, this would mean redesigning the database model completely. The HealthyCar.org data model would not suffice.

Comparison of HealthyCar.org and HealthyToys.org.

Figure 1: Partial screen shots from HealthyCar.org and HealthyToys.org. A car web site is not the best model for a toy site; people think of cars and toys in different ways.

Question: The primary chemicals of interest on HealthyCar.org were lead, bromine, chlorine, and “other chemicals.” Were these the same “chemicals of concern” for toys, or were there more or fewer?

Answer: There were more. The “chemicals of concern” ultimately chosen were antimony, arsenic, bromine, cadmium, chlorine (polyvinyl chloride), chromium, lead, mercury, and tin. It was decided only to include arsenic, cadmium, chlorine, lead, and mercury on product details pages, but the database would still need to account for levels of antimony, bromine, chromium, and tin. The latter chemicals would appear on a page listing “complete chemical results” in parts per million, along with a percentile rating for each element, for each product component tested. As new data were added to the database, the percentiles would need to change accordingly. Also, the XRF device captured information for dozens of other elements, such as iron, copper, and even protactinium. If it were later determined that other chemicals needed to be added to the list, it would be in the Ecology Center’s interest to have this information stored and readily available, even if it were not displayed at the time of the site’s initial release.
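The percentile rating described above can be sketched as follows. This is a minimal illustration in Python, not the site's actual PHP code; the function name `percentile_rank`, the definition of percentile used here (the share of readings at or below a value), and the sample readings are all assumptions made for the example.

```python
from bisect import bisect_right


def percentile_rank(value, all_readings):
    """Return the percentage of readings that are <= value (0 to 100)."""
    if not all_readings:
        raise ValueError("need at least one reading")
    ordered = sorted(all_readings)
    # bisect_right counts how many readings are less than or equal to value.
    return 100.0 * bisect_right(ordered, value) / len(ordered)


# Hypothetical lead readings in parts per million (ppm).
lead_ppm = [0, 12, 40, 150, 600]
before = percentile_rank(150, lead_ppm)

# Loading a new test result shifts the percentile of an existing reading.
lead_ppm.append(5000)
after = percentile_rank(150, lead_ppm)
```

Because every new reading can move the percentile of every existing one, a value like this has to be recomputed (or its cached copy discarded) whenever data are loaded.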

Question: The car site used a numeric scoring system that yielded a decimal value between 0.0 (best) and 5.0 (worst). EcoCenter staff were concerned that this scoring method seemed too scientific and would be off-putting to site visitors. The designer had pre-calculated this value and stored it in the database using a weighting system developed for the chemicals of concern in cars. What mechanism could replace this numeric scoring mechanism, could it be replaced easily, and could the score be calculated in real time in case the formula were to change?

Answer: The numeric score was replaced with three values: HIGH, MEDIUM, and LOW, with different formulas calculating the respective value for each chemical of concern. For example, lead was of high concern at 600 parts per million, while arsenic received a score of HIGH at 100 parts per million. It would be possible to calculate items’ scores in real time, but a balance would need to be found between taxing the database, the web server, and the web browser of the site visitor.
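One way to keep such a scoring rule easy to change is to hold the cut-offs in a single table and compute the level on demand. The sketch below is in Python rather than the site's PHP. Only the HIGH cut-offs for lead and arsenic come from the text; the MEDIUM cut-offs, the inclusive comparison, the function names, and the "worst chemical wins" rule for the overall level are assumptions made for illustration.

```python
# HIGH cut-offs for lead and arsenic come from the case study; the MEDIUM
# cut-offs are made-up placeholders, not the Ecology Center's values.
THRESHOLDS_PPM = {
    "lead": {"medium": 40, "high": 600},
    "arsenic": {"medium": 10, "high": 100},
}

LEVEL_ORDER = ["LOW", "MEDIUM", "HIGH"]


def level_for(chemical, ppm):
    """Map a reading in ppm to LOW, MEDIUM, or HIGH; None if not tested."""
    if ppm is None:
        return None
    cutoffs = THRESHOLDS_PPM[chemical]
    if ppm >= cutoffs["high"]:
        return "HIGH"
    if ppm >= cutoffs["medium"]:
        return "MEDIUM"
    return "LOW"


def overall_level(readings):
    """Assumed rule: a product's overall level is its worst chemical level."""
    levels = [level_for(chem, ppm) for chem, ppm in readings.items()]
    levels = [lvl for lvl in levels if lvl is not None]
    if not levels:
        return None
    return max(levels, key=LEVEL_ORDER.index)
```

With this arrangement, changing a formula means editing one table entry or one function, and no stored scores need to be rewritten.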

A Kitchen Full of Cooks

By the time the project was complete, more than twelve organizations were involved in the HealthyToys project, representing ten U.S. states. Many of the organizations were participating in toy testing, but using different devices and methods to generate data. The collaborative nature of the HealthyToys project created two significant challenges:

  1. Ensure data were “clean,” stored efficiently, and displayed in a useful manner.

    The first test method involved an X-ray Fluorescence (XRF) device, which scanned the surface of an object and captured data in parts per million (PPM) for 37 elements, among which were the nine primary chemicals of concern. The second method, used by some groups, was “digestion,” whereby products were dissolved in nitric acid. The digestion process only quantified data about the presence of lead.

    The idea to include digestion data came in mid-November, meaning Mouko had to design a method to account for situations where data were missing, and to represent the absent data in a clean, concise way.

    Data from all sources were consolidated by the EcoCenter and then sent to Mouko as an Excel® spreadsheet on a periodic basis. In order to ensure that the information entering the database would be organized in an efficient way, that the database would generate information useful to site visitors, and that the database would be flexible enough to permit future growth, Mouko would have to design the database carefully and develop a system to ensure that as little “garbage” as possible found its way in.

  2. Design the site such that features could be changed rapidly and with minimal work.

    Much of the site was conceived “on the fly.” New features appeared rapidly, and existing features either disappeared or changed significantly even up to the final moments before release. HealthyToys.org would be successful if and only if all these features were implemented, and Mouko would need to design its applications with pliancy in mind. It was clear early in the project that the easy addition, removal, and modification of features would be necessary. If, for instance, the algorithm used to calculate a “HIGH” score for arsenic were to change, it could not be disastrous for the site as a whole.

Slashdot Effect

“The Slashdot Effect,” also known as “The Digg Effect,” an “Instalanche,” a “Flash Crowd,” and many other terms, describes what happens when a web site suddenly receives a large volume of traffic due to a link from a popular web site or publication in a major news outlet. An underpowered web site can have trouble coping with the unprecedented traffic, and consequently visitors to the site may get “timeout” messages, may not be provided the content they seek, and in some cases may lose interest in the site, never to return.

The HealthyToys.org site launched on December 5, 2007. The site was featured by Fox News, CNN, Reuters, CBS, The Wall Street Journal, The Washington Post, The San Francisco Chronicle, and according to EcoCenter staff, there were “over 200 TV/radio stories over the [first] 24 hours, [airing] in nearly every market in the US, both large and small.” In many cases, there were multiple stories by a given network throughout the day. Bill Tucker from CNN’s Lou Dobbs Tonight described the site as “popular,” and asked his viewers to “be patient” [4].

On its first day of existence, HealthyToys.org received more than 5.6 million “hits,” over 1.8 million page views and 100,000 unique visitors. The server hosting HealthyToys.org also hosted all of the EcoCenter’s other sites, including ecocenter.org, all of which slowed to a halt. Mouko staff had to think creatively to bring the EcoCenter’s site back to life, and received little assistance from the Internet Service Provider who hosted the EcoCenter’s sites.

Unresponsive Internet Service Provider

The Ecology Center had hired a Houston, Texas-based ISP to provide dedicated hosting for ecocenter.org, HealthyCar.org, and its other sites. In anticipation of additional traffic spawned by HealthyToys.org, EcoCenter staff informed their ISP that they would need additional hardware, including a RAM and processor upgrade, and a boost in concurrent connections. Although the ISP promised to complete these changes well before site launch, they were never made. Site launch had to continue on the planned date, because the media were set to release stories on December 5. Further, when the site was “slashdotted” (see “Slashdot Effect,” above), the ISP was, once again, completely unresponsive. Mouko would need to come up with other solutions rapidly to overcome the sudden burst of traffic, particularly in light of the problems with the ISP.

Solutions and Successes

The following are some of the main areas where Mouko succeeded in the implementation of HealthyToys.org.

Flexibility Helps Resolve Time and Budget

When the Ecology Center initially proposed this new project and its short timeline, Mouko proposed three options:

  1. Low-cost, quick turn-around

    Mouko could replicate the HealthyCar.org site, including a duplication of the database, but replacing all instances of “car” with “toy.” Although this would require the organizational scheme for cars to apply to toys as well, it would enable a very quick deployment of the new site. The features for HealthyToys.org would be exactly the same as those of HealthyCar.org. Nothing would change except nominal graphics and text.

  2. Slightly greater cost, slower turn-around

    Mouko could replicate the design of HealthyCar.org, but would redesign the database to allow for future expansion. This would mean changing some of the HealthyCar.org code to interact properly with the new database model. Mouko would have to spend little time on graphic design, layout, menu systems, JavaScript programming, etc., as this would all be recycled from HealthyCar.org.

  3. Highest cost, slowest turn-around

    Mouko could create a new site from scratch, including a new overall look and feel for the site, database design, programming (JavaScript and server-side dynamic scripting, e.g. PHP). This option would take the most time, and Mouko could not guarantee that the site would be ready by the end of October, but it would have ensured the highest quality.

The Ecology Center initially chose option two: Mouko would redesign the database, but would recycle as much code and as many design elements as possible from the HealthyCar.org site. Because Mouko offered a flexible set of options, the EcoCenter was able to make informed and practical decisions. If Mouko had insisted on writing all its own code and designing the site from scratch, the cost to the EcoCenter would have been much higher, and it is much less likely that the site would have been delivered on time.

Ultimately, the EcoCenter wanted to deviate significantly from the features of HealthyCar.org. It became clear early-on that the structure and features of a site about cars could not map directly to a toy site, and that certain features would be highly desirable by site users, so some custom programming would be necessary. Consequently, the EcoCenter moved back the launch date of HealthyToys from 31 October to 5 December.

The decision to recycle code had down-sides, most notably the inheritance of bugs from the HealthyCar.org site. Mouko discovered late in the design phase that the JavaScript menuing system developed for HealthyCar did not function properly in some older web browsers. After discussing the problem with the Ecology Center, it was decided that the browser issues were less important than the creation of new HealthyToys features, and because time was running short, the bugs would stay and would be revisited at a later time.

Key Success

The site was released on time and was lauded by the media. CNN referred to the site as easy-to-use, easy-to-navigate, and in particular, vastly superior to the product recall site of the Consumer Product Safety Commission.

A Well-planned Database Model Ensured Scalability

Although data arrived from the EcoCenter in an Excel spreadsheet, they were stored ultimately in a MySQL® relational database. Each product had many unique attributes, such as product and retailer codes, country of manufacture, name of manufacturer, name of retailer, and name of distributor. Products were classified into one or more “groups,” such as “Halloween Costumes” or “Bath Toys.” If a product had multiple components or distinct parts, each part might be tested. In multiple cases, as many as fifteen distinct components for a single product were scanned with the XRF device. Each component could potentially be tested in a different location and on different days, and the type of test done (e.g. XRF or digestion) would need to be captured as well.

Because many of the products tested were not tested with an XRF device, the database needed to accommodate the absence of data for non-lead tests. Anticipating that the data collected would change with time, Mouko designed the database with flexibility in mind.
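A data model with these properties might look roughly like the sketch below. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere; every table name, column name, and sample row is hypothetical and is not taken from the actual HealthyToys.org schema. The two ideas it illustrates are a join table, so that a product can belong to several groups, and one result row per element measured, so that a digestion test simply has no rows for elements it could not quantify.

```python
import sqlite3

SCHEMA = """
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE grp     (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE product_grp (
    product_id INTEGER NOT NULL REFERENCES product(id),
    grp_id     INTEGER NOT NULL REFERENCES grp(id),
    PRIMARY KEY (product_id, grp_id)
);
CREATE TABLE component (
    id          INTEGER PRIMARY KEY,
    product_id  INTEGER NOT NULL REFERENCES product(id),
    description TEXT NOT NULL
);
CREATE TABLE test (
    id           INTEGER PRIMARY KEY,
    component_id INTEGER NOT NULL REFERENCES component(id),
    method       TEXT NOT NULL CHECK (method IN ('XRF', 'digestion')),
    tested_on    TEXT
);
CREATE TABLE result (
    test_id INTEGER NOT NULL REFERENCES test(id),
    element TEXT NOT NULL,
    ppm     REAL NOT NULL,
    PRIMARY KEY (test_id, element)
);
"""

db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)

# Hypothetical sample data: one product, two groups, two components.
db.execute("INSERT INTO product VALUES (1, 'Example Toy')")
db.executemany("INSERT INTO grp VALUES (?, ?)",
               [(1, "Bath Toys"), (2, "Halloween Costumes")])
db.executemany("INSERT INTO product_grp VALUES (1, ?)", [(1,), (2,)])
db.executemany("INSERT INTO component VALUES (?, 1, ?)",
               [(1, "green body"), (2, "yellow spots")])
db.executemany("INSERT INTO test VALUES (?, ?, ?, ?)",
               [(1, 1, "XRF", "2007-11-01"), (2, 2, "digestion", "2007-11-20")])
# The XRF test records several elements; the digestion test records lead only.
db.executemany("INSERT INTO result VALUES (?, ?, ?)",
               [(1, "lead", 12.0), (1, "arsenic", 3.0), (2, "lead", 650.0)])


def ppm_for(component_id, element):
    """Return the reading in ppm, or None when that element was not tested."""
    row = db.execute(
        "SELECT r.ppm FROM result r JOIN test t ON t.id = r.test_id "
        "WHERE t.component_id = ? AND r.element = ?",
        (component_id, element),
    ).fetchone()
    return row[0] if row else None
```

Here `ppm_for(2, "arsenic")` returns `None` rather than zero, which lets the display layer distinguish "not tested" from "tested and not detected."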

Key Success

The net result was a database flexible enough to accommodate an effectively unlimited number of products, each with any number of components that had as many or as few tests performed as desired. Products could be assigned any number of categories. This meant efficient storage and retrieval of data, the ability to modify the web site quickly, and the ability to scale the volume of data captured in many dimensions.

Flash Crowds were Less of a Problem with Planning

HealthyToys.org was developed using a dynamic scripting language, PHP, a MySQL database, and a Linux® workstation running the Apache® web server. This collection of technologies, sometimes referred to as LAMP (Linux, Apache, MySQL, and PHP), was chosen for HealthyToys.org primarily because the site upon which it was initially based, HealthyCar.org, had also been developed using LAMP.

While it is often convenient to use languages, like PHP, for dynamic web sites, they cause additional burden on a web server as compared to static (HTML) web pages. Static web pages are sent verbatim, while dynamic pages are calculated when a request is received. In other words, whereas a static page is always the same, a dynamic page could be different each time it is accessed, depending on the parameters sent. When this burden is multiplied by thousands of simultaneous page requests, an underpowered server can suffer dramatically.

Dynamic web sites typically pull their data from a “database server.” Like a web server, a database server will become overwhelmed when many requests (e.g. from PHP programs) occur simultaneously.

A web server is typically configured such that it can only serve a certain number of simultaneous connections. When the number of people requesting files from a web server exceeds the number it is configured to handle, a queue forms. If visitors wait in queue too long, their request times out and instead of the information they requested, they get nothing.

The EcoCenter’s web site existed on a single, dedicated computer that also housed all of the Ecology Center’s other sites. This lone machine was both the web server and the database server. When a visitor requested a web page with dynamic content, the PHP script would query the database, which would then process the request and respond back to the PHP program. The program would in turn process the information from the database, generate a web page, and send it back to the requestor. If a site receives a tremendous amount of traffic, it is clear to see that the server will very quickly be unable to function.

In light of the massive “buzz” that was sure to be generated by HealthyToys.org, Mouko took several steps to ease server load before the site’s release. Aside from revisiting code to ensure that PHP procedures functioned in efficient ways, and taking steps to ensure that the database operated in more streamlined ways, Mouko wrote a caching system that caused PHP pages to simulate static HTML. Once a database query was performed by a PHP script, the resultant HTML would be written to a file. Subsequent identical queries would read the static file, rather than performing the same search a second time or performing the same PHP calculations. Additionally, Mouko took advantage of JavaScript, using the site visitor’s web browser to render some dynamic content. This gave pages the appearance of having been generated dynamically. Many site features relied heavily on JavaScript and web “cookies”; for example, cookies were used to save a product to the visitor’s list and JavaScript was used to determine if a product was already on their list. JavaScript is executed by a web browser and not a web server or database server, so this also helped to avoid load.
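The file-based caching idea can be sketched in a few lines. The example below is written in Python rather than PHP and is not Mouko's code; the cache directory, the hashing of request parameters into a file name, and the `render_search` stand-in are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile

CACHE_DIR = tempfile.mkdtemp(prefix="page_cache_")  # hypothetical location


def cache_path(params):
    """Build a file name from the request parameters, ignoring their order."""
    key = repr(sorted(params.items())).encode("utf-8")
    return os.path.join(CACHE_DIR, hashlib.sha256(key).hexdigest() + ".html")


def get_page(params, render):
    """Return cached HTML if present; otherwise render once and store it."""
    path = cache_path(params)
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            return fh.read()
    html = render(params)  # the expensive step: database query plus templating
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(html)
    return html


calls = []


def render_search(params):
    """Stand-in for the real query-and-render step; records each call."""
    calls.append(params)
    return "<ul><li>results for %s</li></ul>" % params["q"]


first = get_page({"q": "dinosaur", "page": "1"}, render_search)
second = get_page({"page": "1", "q": "dinosaur"}, render_search)
```

The second request is served from the file without calling `render_search` again. A cache of this kind has to be emptied whenever new data are loaded, otherwise visitors would keep seeing stale pages.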

Despite these steps, the server still struggled under the weight of millions of requests. Although HealthyToys.org alone received millions of hits, all the other Ecology Center web sites received elevated traffic as well, further exacerbating server load (and suffering the same ill effects). It was obvious very early in the morning that the site had become nearly unusable. Many requests were timing out, and those that did not time-out took an exceedingly long time to finish. Steps needed to be taken.

HealthyToys.org received millions of hits on launch day, while on the second day there were more page views but fewer hits. Another spike occurred on 19 Dec after a press release, and a final spike occurred the day after Christmas.

Figure 2: HealthyToys.org traffic for December, 2007. By day 2, site was more accessible as shown by more page views and fewer hits.

On December 5, Ecology Center staff were busy responding to press, granting interviews, and otherwise dealing with the newfound fame their site had instantly garnered. Mouko staff set up shop at the EcoCenter offices and began working hurriedly to find the best ways to solve the launch day site freeze. The first step was to contact the Internet Service Provider. There was no method to contact the ISP’s tech support other than via a time-consuming, awkward, and frustrating web page. While awaiting any human acknowledgement from them (it took nearly an hour), Mouko worked to isolate the cause of the slow-down. The first problem was that the Apache web server had been configured to allow only a small number of simultaneous connections. Eventually, the ISP responded to Mouko, stating they would increase the number of simultaneous connections. After this was done, the site slow-down continued.

Ultimately, in order to offset server load, two changes seemed most effective to remedy the problem:

  1. Several very high resolution images were removed from the site, freeing up connections for quick transactions instead of long, laborious ones.

  2. The product images for each product, as well as some site graphics, were moved to Mouko’s own corporate site. This reduced the bandwidth requirements on the main HealthyToys web server and decreased the number of requested connections.
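Moving images to a second host amounts to rewriting the image URLs in each page. A minimal sketch in Python follows, assuming that images lived under a site-relative `/images/` path and using a placeholder host name; neither detail comes from the actual site.

```python
import re

# Hypothetical host name; the text says only that images moved to Mouko's site.
IMAGE_HOST = "https://images.example.com"

# Matches an <img> tag whose src attribute is a site-relative /images/ URL.
IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc=")(/images/[^"]+)(")')


def offload_images(html):
    """Point site-relative /images/ URLs at the secondary host."""
    return IMG_SRC.sub(
        lambda m: m.group(1) + IMAGE_HOST + m.group(2) + m.group(3), html
    )
```

Each image then costs the main server nothing: the browser opens its connection to the other host instead.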

Key Success

The PHP caching system developed for HealthyToys, in addition to efficient coding practices, decreased the impact and duration of the Slashdot Effect. Mouko’s dedication, response, and donation of their own site’s bandwidth helped the EcoCenter ride the storm of the site’s first critical day, despite complete neglect from their Internet Service Provider. Mouko donated the time spent on the day of the site’s launch to the Ecology Center, as well as its own bandwidth while images were being served from mouko.com.

Sophisticated Data Management Tools Overcame Data Inconsistency

Toy and test results data were provided to Mouko in Microsoft Excel spreadsheets containing 124 columns with fields describing the toy, its components, origin, identification numbers, testing methods, test results, etc. Since thousands of rows of data were organized, compiled, and classified by hand, the process was prone to error.

To populate the HealthyToys.org database, Mouko created software to pre-process, filter, and load the data. Error checking, error correction, and quality control were managed in a series of pre-processing and filtering steps, the results of which were output as MySQL loading scripts.

The processing pipeline read comma-separated value (CSV) data exported from Excel files; isolated and loaded core data including country of origin, manufacturer, and retailer; clustered toy components; formatted test results; grouped products by batch number; set nomination and voted status; mapped products to product types and images; and wrote a set of output loading scripts while constantly checking for mis-alignments, mis-classifications, out-of-range test results and dates, etc.
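A heavily simplified version of one such step is sketched below in Python: it reads CSV text, rejects rows that fail basic checks, and emits SQL INSERT statements for the rows that pass. The column names, the target table `product_result`, and the sample rows are hypothetical. The real spreadsheets had 124 columns, and the actual software is not shown in this document.

```python
import csv
import io


def sql_quote(text):
    """Escape backslashes and single quotes for a MySQL string literal.

    Simplified for illustration; driver-level parameter binding is safer.
    """
    return "'" + text.replace("\\", "\\\\").replace("'", "''") + "'"


def build_load_script(csv_text, max_ppm=1_000_000):
    """Validate rows and return (sql_statements, errors)."""
    statements, errors = [], []
    reader = csv.DictReader(io.StringIO(csv_text))
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        name = (row.get("product_name") or "").strip()
        raw = (row.get("lead_ppm") or "").strip()
        if not name:
            errors.append((line_no, "missing product name"))
            continue
        if raw == "":
            lead = "NULL"  # "not tested" is stored as NULL, not as zero
        else:
            try:
                value = float(raw)
            except ValueError:
                errors.append((line_no, "lead_ppm is not a number"))
                continue
            if not 0 <= value <= max_ppm:
                errors.append((line_no, "lead_ppm out of range"))
                continue
            lead = repr(value)
        statements.append(
            "INSERT INTO product_result (name, lead_ppm) VALUES (%s, %s);"
            % (sql_quote(name), lead)
        )
    return statements, errors


SAMPLE = """product_name,lead_ppm
Example Bee Rattle,12
Kid's Costume,
,40
Example Duck,-5
Example Ring,abc
"""

statements, errors = build_load_script(SAMPLE)
```

Rejected rows are reported with their line numbers so that they can be corrected in the spreadsheet by hand, and only the rows that pass ever reach the database.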

The few errors that made it through the system were corrected by hand. The software saved Ecology Center staff time by minimizing the need for human curation and minimizing time spent fielding and responding to error feedback from users.

Key Success

The data management processing pipeline provided “clean data” to the HealthyToys.org web site. The process provided site visitors with a greatly enhanced user experience and saved Ecology Center staff time by decreasing the need for human curation.

Conclusion

HealthyToys.org was a tremendous success, both for Mouko and the Ecology Center. With over 20,000 new additions to their mailing list in the first 24 hours and volumes of donations, this was by far the most successful single project the Ecology Center had done. From start to finish, the project took a very short time. The project started with a great idea, and its success was due largely to the tenacity of the Ecology Center in gathering partner organizations who could share the work load of testing products, and also to the dedication, ingenuity, and rapid response of Mouko. HealthyToys.org was the first site developed by Mouko to garner national attention, and thus it will always be a crowning achievement for the small Ann Arbor company.

The project did not go perfectly, however, and Mouko will approach similar projects differently in the future in the following ways:

  • Clients will be encouraged to standardize on a data model early on. Mouko will assist in the development of Excel templates to ensure that products are classified consistently (e.g. to avoid classifying something as “Teethers and Rattles” when a “Rattles & Teethers” category already exists). This will also ensure that the same data are present regardless of who tested a product or which test was performed, and that data are ordered and labeled in a predictable way. If changes are made to an existing template, all who use the template will be informed and it will be placed in a jointly-accessible location.

  • Before suggesting recycling of another web developer’s code, Mouko will do thorough testing of existing code to ensure it is bug-free. Although recycling a previous site’s design and source code will tend to save time, it would be better to know ahead of time what bugs exist, so there are no surprises well into the project. Regardless of who introduced a bug initially, many will perceive it as the fault of the most recent web developer, who will then be expected to fix it.

  • Mouko will work well ahead of schedule with any ISP to ensure that the system is well-prepared for a large launch day. Clients need to be apprised early on, for budgeting purposes, of the possibility of being Slashdotted, and of the fact that avoiding it may cost thousands of dollars for multiple colocated hosts and load balancing. Mouko will research quality internet service vendors ahead of time to ensure only time-tested companies are used.

  • Adequate time will be allocated before launch day for load testing.

  • A freeze will be placed on new features for a minimum of two weeks prior to site launch or media release, possibly longer for projects of greater complexity.

It was and is a genuine pleasure for Mouko to have had a role in the creation of HealthyToys.org. The project brought unique challenges and a genuine feeling that what was being done was helping to make the world a better place. There is no greater reward than that.

References

  1. Shaw, Elizabeth. 2007. “Legislation limits toxic lead in children's products” The Flint Journal.

  2. HealthyToys.org. 2007. Product Details (Squeezable Dinosaur).

  3. De La Cruz, Veronica. 2007. “Segment: Safe toys Web site” CNN (American Morning).

  4. Tucker, Bill. 2007. “Segment: Buyer Beware” CNN (Lou Dobbs Tonight).

  5. The Ecology Center. 2008. About the Ecology Center.

  6. HealthyToys.org. 2008. HealthyToys.org — About Us.

Acknowledgements

Mouko would like to thank the Ecology Center for their continued support, for permission to cite their case as an example, and most especially for the excellent work they continue to do.

Linux® is the registered trademark of Linus Torvalds in the U.S. and other countries.