On-Demand Webinar

  • Recipes for Success

    How to Build Continuous Ingest Pipelines

    Arvind Prabhakar, StreamSets CTO 

    Modern data infrastructures are fed by vast volumes of data, streamed from an ever-changing variety of sources. Standard practice has been to store the data as ingested and push data cleaning onto each consuming application. This approach saddles data scientists and analysts with substantial work, delays insights, and makes real-time or near-real-time analysis practically impossible.


    In this session you will discover:


    • recipes for building automated ingest pipelines that sanitize data continually in-stream, so that it lands in stores ready to consume, however complex it is to collect.
    • methods for making your pipelines resistant to data drift: the inevitable changes in schema, semantics and infrastructure that break pipelines.
    • open source tools that allow you to create and maintain these pipelines with little to no hand coding.
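
As a rough illustration of the first two points, the sketch below cleans records as they stream through and tolerates a simple form of schema drift (renamed fields). It is plain Python rather than any particular tool's API; the field names, the ALIASES table, and the sanitize function are hypothetical and are not taken from the session.

```python
from typing import Any, Dict, Iterable, Iterator

# Hypothetical aliases: upstream sources may rename fields over time.
ALIASES = {"ts": "timestamp", "event_time": "timestamp", "uid": "user_id"}
# Hypothetical set of fields that consumers rely on.
REQUIRED = ("timestamp", "user_id")


def sanitize(records: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Clean records in-stream and tolerate simple schema drift."""
    for raw in records:
        # Map drifted field names onto canonical ones; unknown fields pass through.
        rec = {ALIASES.get(key, key): value for key, value in raw.items()}
        # Trim stray whitespace from string values.
        rec = {k: v.strip() if isinstance(v, str) else v for k, v in rec.items()}
        # Flag, rather than drop, records that lack a required field.
        rec["_errors"] = [f for f in REQUIRED if rec.get(f) in (None, "")]
        yield rec
```

Because sanitize is a generator, each record is cleaned as it passes through instead of in a batch step after landing. Flagging incomplete records in an _errors field, rather than discarding them, is one possible design choice that keeps drifted data visible for later inspection.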

Section 2: Spin Straw into Gold