Azure Big Data Architecture

I was talking to a fellow architect who wanted to know how to store data for analysis. I walked him through the two standard Azure solutions: one based on SQL DW, the other on the new Azure Data Lake Storage Gen2 (ADLS Gen2). The architecture for ingesting data into SQL DW is well tested and well defined.

However, with the introduction of ADLS Gen2, the obvious question is how to work with the data once it lands there. ADLS Gen2 is essentially storage, so you need other technologies to connect to the data, read it, and work on it. There are quite a few options:

  • HDInsight – ETL at scale overview
  • Azure SQL DW – wait, what? Yep. Azure SQL DW supports an exciting feature called PolyBase, which makes data stored outside SQL DW accessible through SQL DW (wow!!)
  • Azure Data Factory – ADF provides a connector to ADLS Gen2 storage and helps customers transform data and orchestrate data movement
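
To make the PolyBase option a bit more concrete, here is a minimal Python sketch that only generates the T-SQL you would typically run in SQL DW to expose an ADLS Gen2 folder as an external table. It does not connect to anything. The storage account (mydatalake), file system (raw), folder, table, column names, and the object names AdlsGen2Source, CsvFormat, and AdlsCredential are all made up for illustration, and the sketch assumes a database scoped credential with that name already exists and that the files are comma-delimited with a header row.

```python
from typing import List, Sequence, Tuple


def abfss_uri(account: str, filesystem: str, path: str = "") -> str:
    """Build an ABFS (TLS) URI for an ADLS Gen2 file system or a path inside it."""
    base = f"abfss://{filesystem}@{account}.dfs.core.windows.net"
    return f"{base}/{path.lstrip('/')}" if path else base


def polybase_ddl(
    account: str,
    filesystem: str,
    folder: str,
    table: str,
    columns: Sequence[Tuple[str, str]],
    credential: str,
) -> List[str]:
    """Return the three T-SQL statements PolyBase needs, in execution order.

    AdlsGen2Source and CsvFormat are hypothetical object names chosen for
    this sketch; `credential` must name an existing database scoped credential.
    """
    column_list = ",\n    ".join(f"{name} {sql_type}" for name, sql_type in columns)
    location = "/" + folder.strip("/") + "/"

    # 1. Where the data lives: the ADLS Gen2 file system.
    data_source = (
        "CREATE EXTERNAL DATA SOURCE AdlsGen2Source\n"
        "WITH (\n"
        "    TYPE = HADOOP,\n"
        f"    LOCATION = '{abfss_uri(account, filesystem)}',\n"
        f"    CREDENTIAL = {credential}\n"
        ");"
    )
    # 2. How the files are laid out (assumed: CSV with one header row).
    file_format = (
        "CREATE EXTERNAL FILE FORMAT CsvFormat\n"
        "WITH (\n"
        "    FORMAT_TYPE = DELIMITEDTEXT,\n"
        "    FORMAT_OPTIONS (FIELD_TERMINATOR = ',', FIRST_ROW = 2)\n"
        ");"
    )
    # 3. The external table that SQL DW queries like any other table.
    external_table = (
        f"CREATE EXTERNAL TABLE {table} (\n"
        f"    {column_list}\n"
        ")\n"
        "WITH (\n"
        f"    LOCATION = '{location}',\n"
        "    DATA_SOURCE = AdlsGen2Source,\n"
        "    FILE_FORMAT = CsvFormat\n"
        ");"
    )
    return [data_source, file_format, external_table]


if __name__ == "__main__":
    statements = polybase_ddl(
        account="mydatalake",
        filesystem="raw",
        folder="sales",
        table="dbo.SalesExternal",
        columns=[("OrderId", "INT"), ("Amount", "DECIMAL(18, 2)")],
        credential="AdlsCredential",
    )
    print("\n\n".join(statements))
```

Once statements like these have been run, a plain SELECT against dbo.SalesExternal reads the files in the lake, which is what makes SQL DW a valid way to work with ADLS Gen2 data.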

Access to data in ADLS Gen2 is limited less by the available technologies than by developer skills. Whatever skills the developer has, there is at least one technology that can access ADLS Gen2 data.