Channel: SCN : All Content - SAP HANA Developer Center

SAP HANA - Hadoop Integration # 2


Part #1 of this series: SAP HANA - Hadoop Integration # 1

http://scn.sap.com/community/developer-center/hana/blog/2013/05/20/sap-hana--hadoop-integration-1


In October 2012, SAP announced integration of Apache Hadoop into real-time data warehousing environments with a new "big data" bundle and go-to-market strategy with Cloudera, Hitachi Data Systems, Hortonworks, HP and IBM. The offering was based on the flagship SAP HANA® platform and combined the SAP® Sybase® IQ server, SAP® Data Integrator software and SAP® BusinessObjects™ business intelligence (BI) solutions.  It provided a comprehensive data warehousing solution for real-time insights across massive data sets from various sources. 

 

Applying MapReduce patterns to big data

  • Optimized repartition joins


  • Implementing a semi-join


  • Implementing a secondary sort


  • Sorting keys across multiple reducers


  • Reservoir sampling.
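
As an illustration of the last pattern in this list, here is a minimal single-process sketch of reservoir sampling in Python. It is not taken from any Hadoop library; the function name `reservoir_sample` and its parameters are made up for this example. It shows the core idea only: drawing a uniform sample of k records from a stream whose length is not known in advance, in one pass and with memory proportional to k.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Return up to k items chosen uniformly at random from a stream.

    Hypothetical helper for illustration; the stream is read once and
    only k items are held in memory at any time.
    """
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # The first k items fill the reservoir directly.
            reservoir.append(item)
        else:
            # Item i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Sample 5 lines from a generated "log" of 100,000 lines.
    lines = (f"log line {n}" for n in range(100_000))
    print(reservoir_sample(lines, k=5, seed=42))
```

In a MapReduce setting, one common arrangement is for each mapper to keep its own reservoir and for a reducer to combine them, but that combination step is outside this sketch.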

 

 

Streamlining HDFS for big data

  • Using Avro to store multiple small files


  • Picking the right compression codec for your data


  • Compression with HDFS, MapReduce, Pig, and Hive


  • Splittable LZOP with MapReduce, Hive, and Pig.
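
To make the codec trade-off concrete, the following sketch compares compression ratio and time for three codecs from the Python standard library on synthetic log data. This is only an assumption-laden stand-in: LZO/LZOP and Snappy are not in the standard library, so gzip, bzip2 and LZMA are used instead, and the sample data and the `compare_codecs` helper are invented for this example. Results on real data will differ.

```python
import bz2
import gzip
import lzma
import time


def compare_codecs(data: bytes) -> dict:
    """Compress data with each codec and record ratio and elapsed time."""
    codecs = {
        "gzip": gzip.compress,
        "bz2": bz2.compress,
        "lzma": lzma.compress,
    }
    results = {}
    for name, compress in codecs.items():
        start = time.perf_counter()
        compressed = compress(data)
        elapsed = time.perf_counter() - start
        results[name] = {
            # Smaller ratio means better compression.
            "ratio": len(compressed) / len(data),
            "seconds": elapsed,
        }
    return results


if __name__ == "__main__":
    # Synthetic, highly repetitive web-log lines (illustrative only).
    sample = b"".join(
        b"2013-05-20 10:00:%02d GET /index.html 200\n" % (n % 60)
        for n in range(20_000)
    )
    for name, stats in compare_codecs(sample).items():
        print(f"{name}: ratio={stats['ratio']:.4f} time={stats['seconds']:.4f}s")
```

Ratio and speed are only part of the choice on Hadoop. Whether a compressed file can be split across mappers also matters: bzip2 files are splittable, plain gzip files are not, and LZOP files become splittable once they are indexed.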


What does Hadoop bring to HANA?

  • Cost-efficient data storage and processing for large volumes of structured, semi-structured, and unstructured data such as web logs, machine data, text data, call detail records (CDRs), audio, and video data.


  • Batch processing: for workloads where fast response times are less critical than reliability and scalability.


  • Complex information processing: enables heavily recursive algorithms, machine learning, and queries that cannot be easily expressed in SQL.


  • Low-value data archive: data stays available, though access is slower.


  • Post-hoc analysis: mine raw data that is either schema-less or whose schema changes over time.
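
The last point, mining raw data whose schema changes over time, is often handled by applying the schema when the data is read rather than when it is stored. Below is a small Python sketch of that idea under invented assumptions: the sample records, the field names (`ts`/`timestamp`, `action`/`event`) and the helper functions are all hypothetical and only illustrate the approach.

```python
import json
from collections import Counter

# Raw lines as they might sit in a file: the field names drift over time
# and one line is not valid JSON at all (all data here is made up).
RAW_LINES = [
    '{"ts": "2013-05-20T10:00:00", "user": "a", "action": "login"}',
    '{"ts": "2013-05-20T10:01:00", "user": "b", "action": "view", "page": "/home"}',
    'not json at all',
    '{"timestamp": "2013-05-21T09:00:00", "user": "a", "event": "view"}',
]


def read_record(line):
    """Parse one raw line and map old and new field names to one shape."""
    try:
        raw = json.loads(line)
    except json.JSONDecodeError:
        return None  # Skip lines that cannot be parsed.
    if not isinstance(raw, dict):
        return None
    return {
        "ts": raw.get("ts") or raw.get("timestamp"),
        "user": raw.get("user"),
        "action": raw.get("action") or raw.get("event"),
    }


def count_actions(lines):
    """Count actions across all readable records, ignoring bad lines."""
    counts = Counter()
    for line in lines:
        record = read_record(line)
        if record and record["action"]:
            counts[record["action"]] += 1
    return counts


if __name__ == "__main__":
    print(count_actions(RAW_LINES))
```

The raw lines are never rewritten; only the reading code changes when a new field name appears, which is what makes this style convenient for post-hoc analysis.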


SAP HANA / Hadoop Integration


Accommodate both structured and unstructured data.


Pre-process and load the structured billing data via Hadoop.

 

Combine structured and unstructured data and transfer it to SAP HANA via a Hadoop / HANA connector.


SAP HANA enforces the business rules through stored procedures and its columnar database, using its in-memory capabilities.


SAP HANA enables a company to perform sophisticated analytics on its data and give its customers new insights into that data.


SAP HANA with SAP BusinessObjects enables real-time reporting and analysis for those customers.


Use the HANA in-memory capabilities and the breadth of SAP analytic applications to perform sophisticated analytics (e.g. unstructured text analysis).


SAP HANA is used for:

   1) Enforcing the business rules,

   2) Analytics, and

   3) Generating BusinessObjects reports

 

SAP has integrated HANA with Hadoop, enabling customers to move data between Hive and the Hadoop Distributed File System (HDFS) on one side and SAP HANA or the SAP Sybase IQ server on the other. It has also set up a "big data" partner council, which will work to provide products that make use of HANA and Hadoop. One of the key partners is Cloudera. SAP wants it to be easy to connect to data, whether it is in SAP software or in software from another vendor.

 

h1.jpg

 

SAP Data Services: a simple GUI to build and run ETL processes

h2.jpg


Processing Text to extract relevant data from Hadoop


1. Use SAP Data Services to extract:

  • Core entities (who, what, when, where, etc.)
  • Domains (voice of customer, public sector, enterprise, etc.)
  • Sentiment analysis (strong positive, weak positive, neutral, weak negative, strong negative)

2. Perform transformations

  • Map text into pre-defined structures
  • Cleanse, match, de-duplicate data

3. Load results quickly into the enterprise data warehouse (EDW)

  • Map text to structure
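
The three steps above can be sketched in a few lines of Python. This is a toy stand-in and not the SAP Data Services API: the word lists, the function names and the row layout are all assumptions made for illustration, and real text analysis uses far richer linguistic processing. The sketch classifies each comment into one of the five sentiment levels listed above, de-duplicates comments, and maps the result into a fixed structure ready for loading.

```python
import re

# Tiny illustrative word lists (assumptions, not a real sentiment lexicon).
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "slow", "broken"}
INTENSIFIERS = {"very", "really", "extremely"}


def classify_sentiment(text):
    """Return one of the five sentiment levels for a piece of text."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score == 0:
        return "neutral"
    strength = "strong" if any(w in INTENSIFIERS for w in words) else "weak"
    polarity = "positive" if score > 0 else "negative"
    return f"{strength} {polarity}"


def normalize(text):
    """Cleanse text for matching: lowercase and collapse whitespace."""
    return " ".join(text.lower().split())


def to_rows(comments):
    """Map (customer, text) pairs to structured rows, dropping duplicates."""
    seen = set()
    rows = []
    for customer, text in comments:
        key = (customer, normalize(text))
        if key in seen:
            continue  # Same customer, same cleansed text: a duplicate.
        seen.add(key)
        rows.append({
            "customer": customer,
            "text": text.strip(),
            "sentiment": classify_sentiment(text),
        })
    return rows


if __name__ == "__main__":
    comments = [
        ("c1", "The service is really great"),
        ("c1", "the  service is really GREAT"),
        ("c2", "App is slow"),
        ("c3", "It arrived today"),
    ]
    for row in to_rows(comments):
        print(row)
```

Rows in this shape could then be bulk-loaded into a warehouse table with columns for customer, text and sentiment.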


h3.jpg

Data Federation between Hadoop and your Analytic Database


last.jpg

 

1. Big Data for Starters

http://scn.sap.com/community/developer-center/hana/blog/2013/04/26/big-data-for-starters

 

 

2. Advanced level - Tech deep dive on BIG DATA Technologies & Applications.

http://scn.sap.com/community/hana-in-memory/blog/2013/04/30/big-data-technologies-applications

 

Suggestions are most welcome.


