Secure Big Data Analytics
    Combining APIs, Security and Big Data




    Data Center Software Division




1
Two Red Hot Trends - How do they Intersect?




         API Management
         • Enterprises extending their reach through APIs
         • API traffic overtaking web traffic in some cases
         • De facto communication channel from mobile apps to the server

         Big Data Analytics
         • Increased volume, variety, and velocity of unstructured data
         • Drivers: mobile, cloud, social
         • Tremendous ROI




                   How does this affect application architecture to support growth?
2
Big Data Fundamentals

    [Diagram labels]
    Traditional Data Analysis: Transaction, Relational Database, Batch, Data Warehouse, Analyze
    Big Data Analysis: Unstructured, Streaming, Devices, Cluster (MapReduce), Organize, Analyze




    • Structured data                                       • Unstructured, variety of data: “mashup”
    • Data ~ GBs to TBs                                     • Data ~ TBs to PBs
    • Centralized: Data moves to analytics                  • Distributed: Analytics move to the data
    • Batch analytics                                       • Streaming analytics
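
The "analytics move to the data" bullet can be illustrated with a minimal, single-process sketch of the MapReduce pattern. It is not Hadoop code: the node names, the sample events, and every function name below are assumptions made for illustration only. Each "node" maps and pre-aggregates its own partition, and only the small per-node summaries are merged centrally.

```python
from collections import defaultdict

# Hypothetical sample data: each node stores its own partition of device events.
NODE_PARTITIONS = {
    "node1": ["camera motion", "kiosk sale", "camera motion"],
    "node2": ["pos sale", "camera offline"],
    "node3": ["kiosk sale", "pos sale", "pos refund"],
}


def map_partition(lines):
    """Map step, run on the node that holds the data: emit (device, 1) per event."""
    return [(line.split()[0], 1) for line in lines]


def combine(pairs):
    """Local pre-aggregation so only small summaries leave each node."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)


def reduce_counts(partials):
    """Reduce step: merge the per-node summaries into one result."""
    totals = defaultdict(int)
    for partial in partials:
        for key, value in partial.items():
            totals[key] += value
    return dict(totals)


def run_job(partitions):
    """Run map and combine per partition, then reduce the partial results."""
    partials = [combine(map_partition(lines)) for lines in partitions.values()]
    return reduce_counts(partials)


if __name__ == "__main__":
    print(run_job(NODE_PARTITIONS))
```

In a real cluster the map and combine steps would execute on the machines that store each partition; here they run in one process purely to show the data flow.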

                    Focus ventures in one of two areas: monetization of data or
                    infrastructure to enable monetization of data

                    “Only business model tech has left” (March 12, 2012)
3
Today’s Big Data Tools & Hurdles

                                                                   New BI Tools

    “Big Data” includes tools like Hadoop, NoSQL technologies,
    massively parallel processing, and in-memory databases

    Existing Hurdles with Hadoop
    • Job Control - Enable clients to run jobs with security controls
    • Data On-ramping - Get data into Hadoop for processing, from internal
      sources, cloud services or network-connected devices
    • Data Off-ramping - Data availability to clients via APIs, suitable for
      mobile applications
    • Security and Compliance - Big Data processing must provide PII protection,
      data security and PCI compliance
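
The job control and off-ramping hurdles above can be sketched as a thin gateway layer in front of the cluster. This is a toy model under stated assumptions: `JobGateway`, `ROLE_PERMISSIONS`, and `fake_cluster_runner` are hypothetical names, and the runner stands in for whatever actually submits work to Hadoop.

```python
import json

# Hypothetical role model: which roles may submit jobs or read results.
ROLE_PERMISSIONS = {"analyst": {"submit", "read"}, "partner": {"read"}}


def fake_cluster_runner(job_name, params):
    """Stand-in for a real cluster submission; returns a small result dict."""
    return {"job": job_name, "rows": len(params.get("inputs", []))}


class JobGateway:
    """Checks the caller's role before running a job or returning results."""

    def __init__(self, runner):
        self._runner = runner
        self._results = {}

    def _require(self, role, action):
        if action not in ROLE_PERMISSIONS.get(role, set()):
            raise PermissionError(f"role {role!r} may not {action}")

    def submit(self, role, job_name, params):
        """Job control: only permitted roles may start a job."""
        self._require(role, "submit")
        job_id = f"job-{len(self._results) + 1}"
        self._results[job_id] = self._runner(job_name, params)
        return job_id

    def read_json(self, role, job_id):
        """Off-ramping: expose the stored result as a JSON document."""
        self._require(role, "read")
        return json.dumps({"job_id": job_id, "result": self._results[job_id]}, sort_keys=True)


if __name__ == "__main__":
    gateway = JobGateway(fake_cluster_runner)
    job_id = gateway.submit("analyst", "daily_counts", {"inputs": ["a", "b"]})
    print(gateway.read_json("partner", job_id))
```

Keeping the permission check in one place means the same policy applies whether the caller is an internal user or a partner service.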

4
Connecting Data Movement:
    Back End to Device to ALL Departments
    Problem 1: Today’s platforms are fragmented and not securely connected, limiting scale
    Problem 2: Data and potential value locked in fragmented solutions inhibit E2E analytics

    Example platforms (each shown split across Dept A and Dept B):
    • Retail platform: 10k devices, 1M customers
    • Home Energy Platform: 300K home pilot in Germany
    • Telco Service Provider: real-time CDR, 12 TB/day
    • Smart City: 3000+ cameras, 1 PB per 3 months

    [Diagram labels: Edge Devices (NB/ULT Phone, Cameras, Kiosk, PoS, DS), API Control Point, Analytics, API Control Point]




5
API/Service Gateway Fundamentals



    [Diagram labels: Enterprise Service Gateway as Central Proxy, providing Service Mediation, API Management, Data Transformation]

    • Consistent policy enforcement for APIs, centralized across departments
    • Use Models: CSB, ESB-light, Edge Security, API Gateway



      Monetization/Charge Back
       • Meter usage
       • Throttle per SLAs
       • API Analytics

      App Service Gov & Integration
       • API management
       • Policy creation & execution
       • Legacy & SOA integration
       • Orchestrate & transform
       • Protocol translation

      Security, Access, Compliance
       • Edge threat protection
       • Data Loss Prevention
       • Federated ID Brokering
       • PCI PII Data Tokenization

      Developer Community
       • Configuration not code
       • Discovery of aggregated services from IT
       • Metadata
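
"Meter usage" and "Throttle per SLAs" can be sketched with a token bucket per API key. This is one common throttling technique, not necessarily what any particular gateway product implements; the class names, the SLA tuple layout, and the HTTP-style status codes returned are assumptions for illustration.

```python
from collections import defaultdict


class TokenBucket:
    """Allows `burst` immediate calls, then refills at `rate_per_sec`."""

    def __init__(self, rate_per_sec, burst):
        self.rate_per_sec = rate_per_sec
        self.burst = burst
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now):
        """Refill for the elapsed time, then spend one token if available."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(float(self.burst), self.tokens + elapsed * self.rate_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class ThrottlingGateway:
    """Hypothetical gateway front end: one bucket and one usage meter per API key."""

    def __init__(self, slas):
        # slas maps api_key -> (rate_per_sec, burst); the layout is an assumption.
        self.buckets = {key: TokenBucket(rate, burst) for key, (rate, burst) in slas.items()}
        self.usage = defaultdict(int)

    def handle(self, api_key, now):
        bucket = self.buckets.get(api_key)
        if bucket is None:
            return 401  # unknown caller
        if not bucket.allow(now):
            return 429  # over the SLA, throttled
        self.usage[api_key] += 1  # metered for charge back
        return 200


if __name__ == "__main__":
    gateway = ThrottlingGateway({"partner": (1.0, 2)})
    print([gateway.handle("partner", t) for t in (0.0, 0.0, 0.0, 1.0)])
```

The clock is passed in as `now` rather than read inside the bucket, which keeps the sketch deterministic and easy to test.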




                                     Move from Line of Business to “Enterprise” Wide
                                     API Mgt & Utilization of Analytics
6
Last Mile Device Mobile Middleware




                              • High Performance
                              • Version Management
                              • Content Optimization
                              • Quality of Service
                              • Ubiquitous Compatibility
                              • External Cloud Service Support



7
Information Greed



                                                     • Greedy Users: Expect instant response from
                                                       touch-screens, context-aware smartphones,
                                                       etc.
                                                     • Greedy Business: Expects real-time
                                                       intelligence on the consumer derived
                                                       from social, data warehouses, and
                                                       data mining




       Addressing this greed requires new thinking about how to build Composite Applications

8
Composite Distributed Application                                                          Apps




    •   Hybridized – New functionality with legacy code and data

    •   Location Independent - 1-n clouds (private and public) and
        datacenters simultaneously

    •   Knowledge Complete - Access to disparate “Big Data” warehouses owned by the business

    •   Contextual – Produces just-in-time results based on client context, e.g. identity and location

    •   Accessible & Performs – Produces data compatible with any client on any operating system,
        and does it instantaneously

    •   Secure and Compliant - Meets compliance and security requirements for data in transit and
        data at rest




        Composite apps can be realized with a service gateway, which secures, brokers and
        mediates data for API access, and a Hadoop cluster, which provides data analysis and processing

9
“APIfication” of real-time Hadoop datasets
    [Architecture diagram labels]
    Types of Clients: Internal Client Users; Smartphone & Tablet Clients; Partner Web Services
    Client traffic: HTTP/REST interactions with JSON results
    Cloud sources: PaaS Services (Storage, RDBMS); Network-Connected Devices
    Data on-ramping from the cloud with selective protection (FPE/Tokenization)
    Gateway tier: Service Gateway, Gateway Control Point, DMZ
    Existing Apps, Data and Infrastructure: Legacy Apps and Web Services, RDBMS, IDM
    Hadoop: Hadoop API, Job Scheduler, Metadata Server, HDFS (Node1, Node2, Node3)
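
Selective protection during on-ramping can be sketched with a simple in-memory token vault. Note that this shows plain tokenization only, not format-preserving encryption (FPE), which needs a dedicated cipher mode. `TokenVault`, `on_ramp`, the field names, and the `tok_` prefix are all hypothetical choices for illustration.

```python
import secrets


class TokenVault:
    """Maps sensitive values to random tokens and back (in memory only)."""

    def __init__(self):
        self._to_token = {}
        self._to_value = {}

    def tokenize(self, value):
        """Return a stable random token for the value, creating it on first use."""
        if value not in self._to_token:
            token = "tok_" + secrets.token_hex(8)
            self._to_token[value] = token
            self._to_value[token] = value
        return self._to_token[value]

    def detokenize(self, token):
        """Recover the original value; only trusted callers should reach this."""
        return self._to_value[token]


def on_ramp(record, sensitive_fields, vault):
    """Tokenize only the named fields; everything else passes through unchanged."""
    return {
        key: vault.tokenize(value) if key in sensitive_fields else value
        for key, value in record.items()
    }


if __name__ == "__main__":
    vault = TokenVault()
    record = {"card_number": "4111111111111111", "store": "berlin-01", "amount": "19.99"}
    print(on_ramp(record, {"card_number"}, vault))
```

Because the same input always yields the same token, analytics jobs can still group or join on the protected field without seeing the original value.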




10
Pulling it all Together: Ref Arch for Composite Apps




11
Field Case Study




                        Secure ‘Big Data’ Storage and REST API
                        • Authenticate IP cameras based on IP address, 2-way
                          SSL or message security
                        • Codeless insertion and retrieval to and from HBase.
                          Drag and drop with no Java coding
                        • Expose ‘Big Data’ using a REST facade, ideal for
                          native mobile applications and partner services
                        • Provide a secure REST API with authentication and
                          authorization based on OAuth and internal identity
                          stores such as LDAP
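
Two of the checks in this case study, admitting cameras by source IP address and authorizing REST callers by bearer token, can be sketched as follows. The network range, the token, and the scope names are invented for illustration, and the in-memory token table merely stands in for a real OAuth server or LDAP-backed identity store.

```python
import hmac
import ipaddress

# Hypothetical configuration: the camera subnet and the issued tokens are made up.
CAMERA_NETWORKS = [ipaddress.ip_network("10.20.0.0/24")]
ISSUED_TOKENS = {"demo-token-123": {"frames:read"}}


def camera_allowed(source_ip):
    """Admit a camera only if its address falls inside an allowed network."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in CAMERA_NETWORKS)


def authorize(auth_header, required_scope):
    """Return 200, 401 or 403 for an 'Authorization: Bearer <token>' header."""
    if not auth_header or not auth_header.startswith("Bearer "):
        return 401
    presented = auth_header[len("Bearer "):]
    for token, scopes in ISSUED_TOKENS.items():
        # Constant-time comparison avoids leaking how much of the token matched.
        if hmac.compare_digest(presented.encode(), token.encode()):
            return 200 if required_scope in scopes else 403
    return 401


if __name__ == "__main__":
    print(camera_allowed("10.20.0.17"), camera_allowed("192.168.1.5"))
    print(authorize("Bearer demo-token-123", "frames:read"))
```

An IP allow-list alone is weak against spoofing, which is why the slide pairs it with 2-way SSL or message-level security as alternatives.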




12
Suggested Roadmap to Composite Apps & Big Data




13
More:                         www.cloudsecurity.intel.com



     • Gartner Cloud Service Broker Hype Cycle
     • API Patterns White Paper
     • Secure Big Data Solution Brief




14

Secure Big Data Analytics - Hadoop & Intel


Editor's Notes

  • #2 Title: Enterprise API Best Practices
    (John) – ~15 slides – Talk for 25-30 minutes
    I. API Evolution – Where did they come from? (6-8 slides)
       a. APIs evolved from SOA as services
       b. Now they are pervasive – REST/JSON is king
       c. 2011 API growth was huge – what will 2012 look like?
       d. API business model slides – which types of businesses benefit the most from APIs? (Blake to help with this)
       e. Comparison to website – APIs are the new “website”
    II. Categories: Open APIs versus Private APIs (4 slides)
       a. Open APIs focus on developer on-boarding and platform enablement – name examples
       b. Private APIs (Enterprise APIs) focus on security, scalability, and availability – name examples of these (if you have some)
       c. For Enterprise APIs, developer on-boarding is less of an issue
    III. Hosted vs On-Premise (1-2 slides)
       a. What are the pros and cons of hosting an API through an enabler service (Mashery/APIgee) versus doing it yourself?
       b. Hosted – Good for open APIs, as the developer community is more important
       c. On-Premise – Good for private/enterprise grade APIs, as security and scalability are paramount
    (Blake) – 8 to 10 slides – Talk for 10-15 minutes
    III. Enterprise Use cases – Types of things an Enterprise wants to do (1-2 slides)
    IV. The value of the gateway pattern – abstraction (consuming APIs) and security (protecting APIs) – (2 slides)
    V. Security overview – threats, trust, anti-malware, data loss prevention (1 slide)
    VI. Intel Expressway Product Pitch (2 slides)
    VII. Customer Examples (2 slides)
  • #3 This talk is about the confluence of two forces, API Management and Big Data, and how they affect the way Enterprises should think about building applications that support future business growth. What are the trends? For API Management, Enterprises are extending their reach through APIs and, in some cases, API traffic is overtaking web traffic. API communication is the de facto way native mobile applications talk to the server. For Big Data, Enterprises have always had data, and a lot of it. The notion of Big Data is the sudden increase in the volume and variety of data, including mobile data, social data and data in the cloud.
  • #4 Explain traditional data; explain big data…
    Differences:
    Pain points: integrating different types of data, difficult to set up, cost of infrastructure / efficiency
    They are different and complementary – not substantially a replacement
    Telco – cell phone – network optimization, cell phone usage (marketing)
    Government / smart cities – infrastructure, security cams
    Finance – credit risk; financial information
    Web – indexing pages / recommendation engines
  • #9 User
  • #10 The application of the future is composite and distributed. We have to completely lose the notion of monolithic and ‘siloed’ applications.
  • #11 Schedule batch jobs for internal clients or partners with enterprise level security and access control
    Read the results of analytics jobs as API results
    On-ramp data from public cloud sources
    Protect data in motion with message level security, FPE and tokenization