Engineering for Scale, Performance and Stability
Why Are Performance and Stability So Important?
Consumers today have high expectations for the user experience when buying online and little patience for poor performance and downtime. Slow-responding sites pay a price in lost sales and a tarnished brand image. Downtime at peak demand times, such as Black Friday or Cyber Monday, can ruin a company’s financial performance for the year and leave scars that last for years to come.
Why Do Large-Scale Sites Struggle with Performance and Stability?
Large-scale eCommerce sites are complex, distributed systems with many components, tiers and connection points, each of which can become a weak link. Even a small, temporary degradation in the performance of a single element of the distributed fabric, whether a poorly designed API or a hardware malfunction, may lead to disastrous consequences. For example, the granularity of an API’s parameters, combined with the caching configuration of the client tier, may be the difference between great performance and a global site outage for a certain sequence of API calls.
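The interaction between API parameter granularity and client-tier caching can be illustrated with a minimal sketch. Everything in it is a hypothetical illustration rather than a description of any particular site: the workload (100 users each viewing the same 10 products), the `backend_calls` helper and the field names `user_id` and `product_id` are all assumptions. It counts how many requests miss a simple client-side cache when the cache key is coarse versus fine-grained.

```python
from itertools import product


def backend_calls(requests, key_fn):
    """Count backend calls made by a client tier that caches responses by key_fn(request)."""
    cached_keys = set()
    calls = 0
    for request in requests:
        key = key_fn(request)
        if key not in cached_keys:
            # Cache miss: the request goes through to the backend service.
            cached_keys.add(key)
            calls += 1
    return calls


# Hypothetical workload: 100 users each request the same 10 products.
requests = [
    {"user_id": user, "product_id": item}
    for user, item in product(range(100), range(10))
]

# Coarse-grained API: the response depends only on the product.
coarse = backend_calls(requests, lambda r: r["product_id"])

# Fine-grained API: the user is part of the call, so every call is unique to the cache.
fine = backend_calls(requests, lambda r: (r["user_id"], r["product_id"]))

print(coarse, fine)  # 10 1000
```

In this toy workload the same 1,000 requests produce 10 backend calls with the coarse key and 1,000 with the fine-grained one. A difference of that order is what can turn a traffic peak into an overload of the service tier.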
To make matters worse, the problems tend to show themselves in the most critical situations, under heavy peak loads or otherwise unusual execution patterns. Ironically, the greatest success story of the marketing department may lead to the darkest moment for the corporation.
Performance and stability issues are difficult to diagnose and even harder to fix. Even when you have an exceptionally strong technical team with experience designing similar high-performance eCommerce systems, a broad understanding of new technologies, and strong performance engineering and testing practices, it is important to bring in an expert who has worked on similar problems at other companies and can look at the problem from a different angle. That’s why Grid Dynamics is here.
How Can Grid Dynamics Help?
We are site performance experts. We analyze your system, discover bottlenecks and develop a get-well plan. We will identify incremental improvements, such as picking better libraries, tuning garbage collectors or improving the deployment configuration. We will also suggest architectural improvements, such as adding a distributed cache or switching to a different messaging framework.
Sometimes, we will point out structural problems that cannot be fixed effectively without switching to a more scalable and performant architecture. Usually, this will be rather obvious and not a big surprise. Our architects and business analysts will work with your technical and business teams to sort out priorities, validate the proposed direction, develop an upgrade plan and implement improvements. More specifically, we may look at:
- How your application is split between web apps and the services layer
- How each tier deploys, scales and recovers from failures
- What middleware products are used for each tier, and in between
- The type, granularity and invocation pattern of APIs between the layers
- Data volumes, types, flows, storage, aggregation, access, caching and distribution patterns
- Use of CDNs for static and dynamic content
- Architecture and the use of local and enterprise caching fabrics
- Architecture and the use of messaging, ESB and ETL
- Architecture and the use of data stores and file systems
- Architecture and the use of virtualization, public or private clouds
- And much more
Engineering Discipline and Tools Matter
Beyond the software itself, we’ll also look at how your team performs software development, particularly around continuous performance testing and performance engineering tools. We can recommend and implement enhancements to release management processes, along with the underlying infrastructure, to make performance, stability and scalability testing an ongoing part of continuous software delivery. To help integrate performance testing into your company’s continuous delivery process, we developed Jagger, an open-source tool for continuous performance testing. You can learn more about Jagger here.
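As a rough illustration of making performance testing part of continuous delivery, the sketch below shows a simple latency gate that a build pipeline could run on every change. It is not Jagger's API. The function names `measure_latencies`, `percentile` and `passes_gate`, the 95th-percentile metric and the 10% tolerance are all assumptions chosen for the example.

```python
import math
import time


def measure_latencies(operation, runs=50):
    """Time repeated calls to `operation` and return the latencies in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append(time.perf_counter() - start)
    return samples


def percentile(samples, pct):
    """Return the nearest-rank percentile (pct in 1..100) of a non-empty sample list."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def passes_gate(samples, baseline_p95, tolerance=0.10):
    """True if the measured p95 latency is within `tolerance` of the stored baseline."""
    return percentile(samples, 95) <= baseline_p95 * (1 + tolerance)
```

A pipeline step could call `measure_latencies` against a test deployment, compare the result with the baseline recorded for the previous release, and fail the build when `passes_gate` returns `False`. Realistic setups also need a stable test environment and representative load, which this sketch leaves out.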
Where Did Our Expertise Come From?
The design and implementation of large-scale, mission-critical systems have always been the primary focus of Grid Dynamics. Our architects and engineers design, develop, deploy, tune and support large-scale systems on a daily basis.
Our customers include some of the largest online sites, like eBay and PayPal, leading technology providers like Microsoft, Cisco and VMware, and specialized online technology services companies like RingCentral, RedAril and RivalWatch.
We constantly maintain and expand the list of core enterprise technologies we know down to the internals and use to enable scalability and performance. The most popular include:
- Caching: Oracle Coherence, VMware GemFire, GigaSpaces, Terracotta, GridGain, AppFabric
- Big Data: Hadoop, HBase, HDFS, Pig, Hive, EMC Greenplum
- Messaging: RabbitMQ, ZeroMQ
- Search: Solr, Endeca, Fredhopper
- Monitoring: Nagios, Opsview, collectd
- Public clouds: Amazon Web Services, Rackspace, Azure, GoGrid
- Private clouds: OpenStack, VMware
- PaaS: Google App Engine, Heroku, Cloud Foundry
- Cloud Management: RightScale, enStratus
- Automation tools: Chef
We also have a growing number of technology partners with whom we work closely to develop customer-centric solutions and bring them to market. Our partners include Cloudera, GridGain, Terracotta, GigaSpaces, GoGrid, Rackspace, Lucid Imagination, VMware, Cisco, Microsoft and Oracle.
If You Had Site Performance or Stability Issues Last Holiday Season, You Just Have to Talk to Us!
We are the only independent company of our kind: we focus on the scalability and performance of large-scale eCommerce platforms and are highly regarded in the industry for our deep expertise and engineering excellence. Our engagement model is simple enough that even the initial exploration is valuable for you. Contact us today for more information.