collectd & PostgreSQL

       Mark Wong
 markwkm@postgresql.org
 mark.wong@myemma.com
         PDXPUG


    November 17, 2011
My Story



     • How did I get to collectd?
     • What is collectd?
     • Hacking collectd
     • Using collectd with Postgres
     • Visualizing the data




Brief background




     • Working at a little company called Emma http://myemma.com
     • Collect performance data from production systems




What did we have?



     • A database with over 1 million database objects
           • >500,000 tables
           • >1,000,000 indexes

      • Tables alone generate 11,000,000 data points per sample (roughly 20+
        statistics per table in pg_stat_all_tables across the >500,000 tables)




What did we try?


  Only free things:
      • Cacti http://www.cacti.net/
      • Ganglia http://ganglia.info/
      • Munin http://munin-monitoring.org/
      • Reconnoiter https://labs.omniti.com/labs/reconnoiter
      • Zenoss http://community.zenoss.org/




What doesn’t work

  All of these depend on RRDtool, which can’t handle more than hundreds of
  thousands of metrics (see “Application Buffer-Cache Management for
  Performance: Running the World’s Largest MRTG” by David Plonka, Archit
  Gupta, and Dale Carder, LISA 2007):
      • Cacti
      • Ganglia
      • Munin
      • Reconnoiter
      • Zenoss



Reconnoiter almost worked for us

  Pros:
      • Write your own SQL queries to collect data from Postgres
      • Used Postgres instead of RRDtool for storing data
      • JavaScript based on-the-fly charting
      • Support for integrating many other health and stats collection solutions
  Cons:
      • Data collection still couldn’t keep up; maybe it needed more tuning
      • Faster hardware? (we were running on VMs)
      • More hardware? (scale out the MQ processes)


Couldn’t bring myself to try anything else



      • Hands were tied; no resources were available to help move forward.
      • Could we build something lightweight?
      • Played with collectd (http://collectd.org/) while evaluating
          Reconnoiter




What is collectd?



          collectd is a daemon which collects system performance
          statistics periodically and provides mechanisms to store the
          values in a variety of ways, for example in RRD files.

  http://collectd.org/




Does this look familiar?




  Note: RRDtool is an option, not a requirement
What is special about collectd?

  From their web site:
      • it’s written in C for performance and portability
      • includes optimizations and features to handle hundreds of thousands
        of data sets
      • PostgreSQL plugin enables querying the database
      • Can collect most operating system statistics (I say “most” because I
        don’t know whether anything is missing)
      • Over 90 total plugins
         http://collectd.org/wiki/index.php/Table_of_Plugins


collectd data description

      • time - when the data was collected
      • interval - frequency of data collection
      • host - server hostname
      • plugin - collectd plugin used
      • plugin instance - additional plugin information
      • type - type of data collected for set of values
      • type instance - unique identifier of the metric
      • dsnames - names for the values collected
      • dstypes - type of data for values collected (e.g. counter, gauge, etc.)
      • values - array of values collected

PostgreSQL plugin configuration
  Define custom queries in collectd.conf:

  LoadPlugin postgresql
  <Plugin postgresql>
     <Query magic>
         Statement "SELECT magic FROM wizard;"
         <Result>
             Type gauge
             InstancePrefix "magic"
             ValuesFrom magic
         </Result>
     </Query>
  ...

. . . per database.

...
   <Database bar>
       Interval 60
       Service "service_name"
       Query backend # predefined
       Query magic_tickets
   </Database>
</Plugin>


Full details at
http://collectd.org/wiki/index.php/Plugin:PostgreSQL
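
Assembled, a minimal complete configuration might look like the sketch below. This is illustrative only, using the query and database names from the fragments above and wiring the <Database> block to the magic query defined earlier:

  LoadPlugin postgresql
  <Plugin postgresql>
      <Query magic>
          Statement "SELECT magic FROM wizard;"
          <Result>
              Type gauge
              InstancePrefix "magic"
              ValuesFrom magic
          </Result>
      </Query>
      <Database bar>
          Interval 60
          Service "service_name"
          Query backend # predefined
          Query magic
      </Database>
  </Plugin>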

Hurdles



  More metadata:
      • Need a way to save schema, table, and index names; can’t differentiate
        stats between tables and indexes
      • Basic support of meta data in collectd but mostly unused
      • How to store data in something other than RRDtool




Wanted: additional metadata


  Hack the PostgreSQL plugin to create metadata for:
      • database - database name (maybe not needed; same as
        plugin instance)
      • schemaname - schema name
      • tablename - table name
      • indexname - index name
      • metric - e.g. blks_hit, blks_read, seq_scan, etc.




Another database query for collecting a table statistic



  <Query table_stats>
      Statement "SELECT schemaname, relname, seq_scan FROM pg_stat_all_tables;"
  </Query>




Identify the data



  <Result>
      Type counter
      InstancePrefix "seq_scan"
      InstancesFrom "schemaname" "relname"
      ValuesFrom "seq_scan"
  </Result>




Meta data specific parameters


  <Database postgres>
      Host "localhost"
      Query table_stats
      SchemanameColumn 0
      TablenameColumn 1
  </Database>



  Note: The database name is set from what is specified in the <Database>
  tag if it is not retrieved by the query.

Example data

     • time: 2011-10-20 18:04:17-05
     • interval: 300
     • host: pong.int
     • plugin: postgresql
     • plugin instance: sandbox
     • type: counter
     • type instance: seq_scan-pg_catalog-pg_class
     • dsnames: {value}
     • dstypes: {counter}
     • values: {249873}

Example metadata



     • database: sandbox
     • schemaname: pg_catalog
     • tablename: pg_class
     • indexname:
     • metric: seq_scan




Now what?



  Hands were tied (I think I mentioned that earlier); open-sourced the work to date:

      • collectd forked with patches
        https://github.com/mwongatemma/collectd
      • YAMS https://github.com/myemma/yams




Yet Another Monitoring System




Switching hats and boosting code




  Used extracurricular time, and equipment donated to the Postgres project by
  Sun, IBM, and HP, to continue proving out the collectd changes.




How am I going to move the data?

  Options from available write plugins; guess which I used:
      • Carbon - Graphite’s storage API to Whisper
        http://collectd.org/wiki/index.php/Plugin:Carbon
      • CSV http://collectd.org/wiki/index.php/Plugin:CSV
      • Network - Send/Receive to other collectd daemons
        http://collectd.org/wiki/index.php/Plugin:Network
      • RRDCacheD http://collectd.org/wiki/index.php/Plugin:RRDCacheD
      • RRDtool http://collectd.org/wiki/index.php/Plugin:RRDtool
      • SysLog http://collectd.org/wiki/index.php/Plugin:SysLog
      • UnixSock http://collectd.org/wiki/index.php/Plugin:UnixSock
      • Write HTTP - PUTVAL (plain text), JSON
        http://collectd.org/wiki/index.php/Plugin:Write_HTTP

Process of elimination

  If RRDtool (written in C) can’t handle massive volumes of data, a Python
  RRD-like database (Whisper) probably can’t either:
       • Carbon
       • CSV
       • Network
       • RRDCacheD
       • RRDtool
       • SysLog
       • UnixSock
       • Write HTTP - PUTVAL (plain text), JSON

Process of elimination


  Writing to other collectd daemons or just locally doesn’t seem useful at the
  moment:
      • CSV
      • Network
      • SysLog
      • UnixSock
      • Write HTTP - PUTVAL (plain text), JSON




Process of elimination



  Let’s try CouchDB’s RESTful JSON API!
       • CSV
       • SysLog
       • Write HTTP - PUTVAL (plain text), JSON




Random: What Write HTTP PUTVAL data looks like
  Note: Each PUTVAL record is a single line; each is broken across two lines
  here to fit onto the slide.

  PUTVAL leeloo.lan.home.verplant.org/disk-sda/disk_octets
      interval=10 1251533299:197141504:175136768
  PUTVAL leeloo.lan.home.verplant.org/disk-sda/disk_ops
      interval=10 1251533299:10765:12858
  PUTVAL leeloo.lan.home.verplant.org/disk-sda/disk_time
      interval=10 1251533299:5:140
  PUTVAL leeloo.lan.home.verplant.org/disk-sda/disk_merged
      interval=10 1251533299:4658:29899


Random: What the Write HTTP JSON data looks like
  Note: Write HTTP packs as much data as it can into a 4KB buffer.
   [ {
       "values": [197141504, 175136768],
       "dstypes": ["counter", "counter"],
       "dsnames": ["read", "write"],
       "time": 1251533299,
       "interval": 10,
       "host": "leeloo.lan.home.verplant.org",
       "plugin": "disk",
       "plugin_instance": "sda",
       "type": "disk_octets",
       "type_instance": ""
     }, ... ]
I didn’t know anything about CouchDB at the time



      • Query interface not really suited for retrieving data to visualize
      • Insert performance not suited to millions of metrics over short
        intervals (the same data can be inserted into Postgres several
        orders of magnitude faster)




Now where am I going to put the data?



  Hoping that the Write HTTP plugin is still a good choice:
      • Write an ETL
            • Table partitioning logic; creation of partition tables
            • Transform JSON data into INSERT statements
      • Use Postgres




Database design
                Table "collectd.value_list"
       Column      |           Type           | Modifiers
  -----------------+--------------------------+-----------
   time            | timestamp with time zone | not null
   interval        | integer                  | not null
   host            | character varying(64)    | not null
   plugin          | character varying(64)    | not null
   plugin_instance | character varying(64)    |
   type            | character varying(64)    | not null
   type_instance   | character varying(64)    |
   dsnames         | character varying(512)[] | not null
   dstypes         | character varying(8)[]   | not null
   values          | numeric[]                | not null
Take advantage of partitioning




  At least table inheritance, in Postgres’ case; partition the data by plugin




Child table
               Table "collectd.vl_postgresql"
       Column      |           Type            | Modifiers
  -----------------+--------------------------+-----------
   ...
   database        | character varying(64)     | not null
   schemaname      | character varying(64)     |
   tablename       | character varying(64)     |
   indexname       | character varying(64)     |
   metric          | character varying(64)     | not null
  Check constraints:
      "vl_postgresql_plugin_check" CHECK (plugin::text =
                                           ’postgresql’::text)
  Inherits: value_list
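
A minimal DDL sketch that would produce this layout, reconstructed from the two \d listings above (not necessarily the project’s exact DDL):

  CREATE TABLE collectd.value_list (
      time            timestamp with time zone NOT NULL,
      interval        integer                  NOT NULL,
      host            character varying(64)    NOT NULL,
      plugin          character varying(64)    NOT NULL,
      plugin_instance character varying(64),
      type            character varying(64)    NOT NULL,
      type_instance   character varying(64),
      dsnames         character varying(512)[] NOT NULL,
      dstypes         character varying(8)[]   NOT NULL,
      "values"        numeric[]                NOT NULL  -- quoted: VALUES is reserved
  );

  -- Child table per plugin; inherits all columns above and adds the metadata.
  CREATE TABLE collectd.vl_postgresql (
      database   character varying(64) NOT NULL,
      schemaname character varying(64),
      tablename  character varying(64),
      indexname  character varying(64),
      metric     character varying(64) NOT NULL,
      CONSTRAINT vl_postgresql_plugin_check
          CHECK (plugin::text = 'postgresql'::text)
  ) INHERITS (collectd.value_list);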
How much partitioning?


  Lots of straightforward options:
      • Date
      • Database
      • Schema
      • Table
      • Index
      • Metric
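
For example, splitting vl_postgresql further by date could mean one hypothetical child table per day (name and range illustrative; the plugin CHECK is inherited from vl_postgresql):

  CREATE TABLE collectd.vl_postgresql_20111020 (
      -- range constraint so the planner can skip this child for other days
      CHECK ("time" >= '2011-10-20' AND "time" < '2011-10-21')
  ) INHERITS (collectd.vl_postgresql);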




Back to the ETL


  Parameters were set for the fastest path to a working prototype:
      • Keep using HTTP POST (the Write HTTP plugin) for the HTTP protocol
        and JSON
      • Use Python for its built-in HTTP server and JSON parsing (Emma is
        primarily a Python shop)
      • Use SQLAlchemy/psycopg2




Back again to the ETL

  Python didn’t perform; the combination of JSON parsing, data transformation,
  and INSERT performance was still several orders of magnitude below
  acceptable levels. The replacement stack:
       • redis to queue data to transform
       • lighttpd for the HTTP interface
       • fastcgi C program to push things to redis
       • multi-threaded C program using libpq for the Postgres API
             • pop data out of redis
             • table partition creation logic
             • transform JSON data into INSERT statements (sketched below)
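
A sketch of that last transform, turning the disk_octets record from the JSON slide earlier into an INSERT against value_list ("values" is quoted because VALUES is a reserved word in SQL):

  INSERT INTO collectd.value_list
      (time, interval, host, plugin, plugin_instance,
       type, type_instance, dsnames, dstypes, "values")
  VALUES
      (to_timestamp(1251533299), 10, 'leeloo.lan.home.verplant.org',
       'disk', 'sda', 'disk_octets', '',
       '{read,write}', '{counter,counter}', '{197141504,175136768}');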


Success?




     • Table statistics for 1 million tables are collected in approximately 12 minutes.
     • Is that acceptable?
     • Can we go faster?




If you don’t have millions of data points


  Easier ways to visualize the data:
       • RRDtool
       • RRDtool compatible front-ends
         http://collectd.org/wiki/index.php/List_of_front-ends
       • Graphite with the Carbon and Whisper combo
         http://graphite.wikidot.com/
       • Reconnoiter




        __      __
       /  \~~~/  \  . o O ( Thank you! )
 ,----(      oo    )
/      \__     __/
/|         (\  |(
^ \   /___\  /\ |
   |__|    |__|-"




Acknowledgements

  Hayley Jane Wakenshaw

              __      __
             /  \~~~/  \
       ,----(      oo    )
      /      \__     __/
      /|         (\  |(
      ^ \   /___\  /\ |
         |__|    |__|-"



License



  This work is licensed under a Creative Commons Attribution 3.0 Unported
  License. To view a copy of this license, (a) visit
  http://creativecommons.org/licenses/by/3.0/us/; or, (b) send a
  letter to Creative Commons, 171 2nd Street, Suite 300, San Francisco,
  California, 94105, USA.




