Integration with Amazon S3

We're very excited to announce that, as of today, Mixnode's integration with Amazon S3 is generally available. You can now use Mixnode to write queries against the web and send the results to your own Amazon S3 buckets.

What is Amazon S3?

Amazon Simple Storage Service (S3) is a highly scalable, durable and cost-effective storage service in the cloud.

The reliability and seamless scalability of Amazon S3 make it the perfect choice for persisting the output of your queries for long-term or intermediate storage, regardless of data volume.

Benefits of Integrating Mixnode with Amazon S3

Mixnode's integration with Amazon S3 is especially beneficial when your Mixnode query is expected to return more data than your instance's storage limits allow, for example when:

  • Extracting as much data as possible from the web for large-scale indexing purposes.
  • Building a large-scale data repository or augmenting an existing one.
  • Archiving portions of the web.

Additionally, many software packages allow for seamless streaming and processing of data from Amazon S3; therefore, S3 can also be used as an intermediate storage buffer for post-processing using other packages.

Last but not least, Mixnode's integration with S3 supports many popular data formats, such as JSON and CSV, as well as big-data-friendly formats such as Apache Avro and Apache Parquet, among others. You can store the results in the format that best suits your use case and consume them later using software packages, modules and native libraries.
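As a rough illustration of consuming stored results later, here is a minimal Python sketch. It assumes the results were written either as newline-delimited JSON (one object per line) or as CSV with a header row; the exact layout Mixnode writes is not described in this post, so treat both as assumptions. The function names, the bucket name and the object key are hypothetical, and the `s3_lines` helper relies on the third-party `boto3` package.

```python
import csv
import json
from typing import Iterable, Iterator


def iter_json_lines(lines: Iterable[bytes]) -> Iterator[dict]:
    """Yield one dict per non-empty line of newline-delimited JSON (assumed layout)."""
    for raw in lines:
        text = raw.decode("utf-8").strip()
        if text:
            yield json.loads(text)


def iter_csv_rows(lines: Iterable[bytes]) -> Iterator[dict]:
    """Yield one dict per CSV row, keyed by the header row.

    Assumes no field contains an embedded newline, since the input is
    already split into lines.
    """
    decoded = (raw.decode("utf-8") for raw in lines)
    yield from csv.DictReader(decoded)


def s3_lines(bucket: str, key: str) -> Iterator[bytes]:
    """Stream an S3 object line by line without loading it all into memory."""
    import boto3  # third-party; imported lazily so the parsers above work without it

    body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"]
    return body.iter_lines()


if __name__ == "__main__":
    # Local demonstration with in-memory bytes instead of a real bucket.
    sample = [b'{"url": "https://example.com", "status": 200}']
    for record in iter_json_lines(sample):
        print(record["url"], record["status"])
```

With credentials configured, something like `iter_json_lines(s3_lines("my-results-bucket", "query-output.jsonl"))` would process the stored output one record at a time; both names are placeholders.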

Give It a Try!

If you're interested, please give the Mixnode/Amazon S3 integration a try and contact us at hi@mixnode.com if you have any questions or comments.
