As the Snowplow platform matures and is adopted more widely, understanding how Snowplow performs under various event scales and distributions becomes increasingly important.

Our new open-source Avalanche project is our attempt to create a standardized framework for testing Snowplow batch and real-time pipelines under various loads. We hope it will also expand our own and the community's knowledge of which configurations work best, and help us discover (and then remove!) any limitations we come across.

At launch, Avalanche is wholly focused on load-testing the Snowplow collector components. Over time we hope to extend it to: load-testing other Snowplow components (and indeed the end-to-end pipeline); automated auditing of test runs; and testing other event platforms.

In the rest of this post we will cover:

1. How to set up the environment

Avalanche comes pre-packaged as an AMI available directly from the Community AMIs section when launching a fresh EC2 instance. Simply search for snowplow-avalanche-0.1.0 to find the required AMI and then follow these setup instructions to get started.

Once the instance has been launched and you have SSH'ed onto the box, you will need to set up your environment variables for the simulation:

  • SP_COLLECTOR_URL: your Snowplow Collector endpoint
  • SP_SIM_TIME: the total simulation time in minutes
  • SP_BASELINE_USERS: the baseline number of users pinging the collector
  • SP_PEAK_USERS: the peak number of users that the load test ramps up to
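As a rough sketch, the four variables could be exported in the shell session before launching Gatling. Every value below is a hypothetical placeholder, and the exact format expected for SP_COLLECTOR_URL (for example, with or without a scheme) is an assumption to check against the setup instructions:

```shell
# Placeholder values for illustration only; substitute your own.
export SP_COLLECTOR_URL="http://collector.example.com"  # hypothetical endpoint
export SP_SIM_TIME=10          # total simulation time, in minutes
export SP_BASELINE_USERS=100   # baseline number of users
export SP_PEAK_USERS=1000      # peak number of users

# Fail early with a message if any of the variables is unset or empty.
: "${SP_COLLECTOR_URL:?SP_COLLECTOR_URL must be set}"
: "${SP_SIM_TIME:?SP_SIM_TIME must be set}"
: "${SP_BASELINE_USERS:?SP_BASELINE_USERS must be set}"
: "${SP_PEAK_USERS:?SP_PEAK_USERS must be set}"
```

The `${VAR:?message}` checks are optional; they simply stop the shell with an error if a variable was forgotten, which is cheaper than discovering it partway through a run.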

You can then go ahead and launch Gatling using either our launch script:

ubuntu$ ./snowplow/scripts/2_run.sh

Or you can launch it yourself:

ubuntu$ /home/ubuntu/snowplow/gatling/gatling-charts-highcharts-bundle-2.2.1-SNAPSHOT/bin/gatling.sh -sf /home/ubuntu/snowplow/src

After which you can select the simulation you wish to run:

Choose a simulation number:
     [0] com.snowplowanalytics.avalanche.ExponentialPeak
     [1] com.snowplowanalytics.avalanche.LinearPeak

Or to directly launch the simulation without any interaction:

ubuntu$ /home/ubuntu/snowplow/gatling/gatling-charts-highcharts-bundle-2.2.1-SNAPSHOT/bin/gatling.sh -sf /home/ubuntu/snowplow/src -s com.snowplowanalytics.avalanche.ExponentialPeak

This can be useful if you wish to run Avalanche across many EC2 instances at the same time and would like to supply the launch command in the User-Data section instead of having to SSH onto each instance.
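A User-Data script for this might look something like the sketch below, which writes the script to a local file (the file name avalanche-user-data.sh is hypothetical) so that its contents can be pasted into the User-Data field. It assumes the script runs at first boot without an interactive shell, so the environment variables have to be exported inside the script itself; all values are placeholders.

```shell
# Write a hypothetical User-Data script to a local file.
# The quoted 'EOF' keeps everything inside the heredoc literal.
cat > avalanche-user-data.sh <<'EOF'
#!/bin/bash
# Placeholder values: replace with your own before launching.
export SP_COLLECTOR_URL="http://collector.example.com"
export SP_SIM_TIME=10
export SP_BASELINE_USERS=100
export SP_PEAK_USERS=1000

# Launch the chosen simulation without any interactive prompt.
/home/ubuntu/snowplow/gatling/gatling-charts-highcharts-bundle-2.2.1-SNAPSHOT/bin/gatling.sh \
  -sf /home/ubuntu/snowplow/src \
  -s com.snowplowanalytics.avalanche.ExponentialPeak
EOF
```

Swapping the class name after -s for com.snowplowanalytics.avalanche.LinearPeak would select the other simulation.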

For very high throughputs, you will need to contact Amazon Technical Support to have them pre-warm your Load Balancer to be able to handle the throughput being generated by Gatling.

Note: using Gatling, we comfortably managed 825,000 requests per minute from a single c4.8xlarge instance. For much more than this, we recommend running Avalanche from multiple instances.
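To put that figure in context, here is a small back-of-the-envelope calculation. It treats 825,000 requests per minute as a per-instance planning figure and assumes throughput scales linearly with the number of instances, which may not hold in practice; the 3,000,000 requests-per-minute target is purely hypothetical.

```shell
# Per-instance figure quoted above: requests per minute from one c4.8xlarge.
PER_INSTANCE_RPM=825000

# Ceiling division: the smallest number of instances whose combined
# per-instance figure covers the target (assumes linear scaling).
instances_needed() {
  target_rpm=$1
  echo $(( (target_rpm + PER_INSTANCE_RPM - 1) / PER_INSTANCE_RPM ))
}

echo "Per-instance rate: $(( PER_INSTANCE_RPM / 60 )) requests per second"
echo "Instances for 3000000 requests per minute: $(instances_needed 3000000)"
```

825,000 requests per minute works out to 13,750 requests per second, and the hypothetical 3,000,000 target would need four instances under these assumptions.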

2. How to access results

Gatling generates its results as a simple web page. The directory in which these pages are stored is set by the -rf flag passed when you launch Gatling. When launching via the 2_run.sh script above, this is set to /home/ubuntu/snowplow/results.
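To find the most recent report on the instance, something like the following could be used. This is a sketch that assumes each run is written to its own subdirectory of the results folder with an index.html entry page; the latest_report helper is a hypothetical name, not part of Avalanche or Gatling.

```shell
# Default to the results folder used by 2_run.sh; override if needed.
RESULTS_DIR="${RESULTS_DIR:-/home/ubuntu/snowplow/results}"

# Print the path to index.html in the most recently modified run directory.
# Returns non-zero if no run directory (or no index.html) is found.
latest_report() {
  latest=$(ls -1dt "$1"/*/ 2>/dev/null | head -n 1)
  [ -n "$latest" ] && [ -f "${latest}index.html" ] && echo "${latest}index.html"
}

latest_report "$RESULTS_DIR" || echo "No reports found in $RESULTS_DIR"
```

The printed path can then be copied off the instance and opened in a browser.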
