産業技術総合研究所
高野了成
June 29, 2015
A Look Inside Google's Data Center Networks
Source: Amin Vahdat, "A Look Inside Google's Data Center Network", 2015 Open Networking Summit Keynote.
http://www.theplatform.net/2015/06/19/inside-a-decade-of-google-homegrown-datacenter-networks/
Google's warehouse-scale networking infrastructure
• B4 [SIGCOMM 2013]
• Andromeda [ONS 2014]
• Firehose -> Watchtower -> Saturn -> Jupiter [ONS 2015, SIGCOMM 2015]
B4: Globally-deployed SDN WAN
Google's OpenFlow WAN
G-Scale Network Hardware
● Built from merchant silicon
  ○ 100s of ports of nonblocking 10GE
● OpenFlow support
● Open source routing stacks for BGP, IS-IS
● Does not have all features
  ○ No support for AppleTalk...
● Multiple chassis per site
  ○ Fault tolerance
  ○ Scale to multiple Tbps
Goal: cut back on adding links and move away from expensive network equipment
Result: raised link bandwidth utilization from 30% to 70% with in-house switches
Source: OpenFlow@Google, 2012 Open Networking Summit
B4: Globally-deployed SDN WAN
• Design approach
  – Applications reserve network bandwidth in advance
  – Routes are computed centrally from application priority and bandwidth demand, then injected into the switches (a small sketch of this idea follows below)
• Observations
  – A good fit for optical paths
  – The OGF NSI standard could perhaps be used instead of OpenFlow
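As a rough illustration of the design approach above, here is a minimal Python sketch of a central allocator that grants pre-reserved bandwidth by priority class. The flow names, priorities, and link capacity are made up for illustration; this is not B4's actual traffic-engineering algorithm, which (per the SIGCOMM 2013 paper) does far more, including multipath tunnel allocation.

```python
# Minimal sketch (not Google's actual B4 TE algorithm): a central allocator
# grants pre-reserved bandwidth on one link, highest priority class first.
# Flow names, priorities, and capacities below are invented for illustration.

def allocate(link_capacity_gbps, requests):
    """requests: list of (flow_name, priority, demand_gbps); higher priority wins."""
    allocation = {}
    remaining = link_capacity_gbps
    for priority in sorted({p for _, p, _ in requests}, reverse=True):
        cls = [(name, demand) for name, p, demand in requests if p == priority]
        total = sum(d for _, d in cls)
        for name, demand in cls:
            if total <= remaining:
                allocation[name] = demand                    # whole class fits
            else:
                # Not enough left: scale every flow in this class down proportionally.
                allocation[name] = demand * remaining / total if total else 0.0
        remaining = max(0.0, remaining - total)
    return allocation

if __name__ == "__main__":
    reqs = [("copy-index", 2, 40), ("user-facing", 3, 30), ("backup", 1, 80)]
    print(allocate(100, reqs))
    # user-facing gets 30, copy-index gets 40, backup gets the remaining 30
```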
Andromeda network virtualization
• Goal: deliver the raw performance of the underlying network while simultaneously exposing network function virtualization (NFV)
http://googlecloudplatform.blogspot.jp/2014/04/enter-andromeda-zone-google-cloud-platforms-latest-networking-stack.html
Traffic generated by servers in Google DCs
Motivation
• Traditional network architectures could not keep up with bandwidth demands in the DC
• Operational complexity of "box-centric" deployment
• Inspired by server and storage scale-out, three principles were employed to redesign the DCN (a sizing sketch follows this list):
  – Clos topologies
  – Merchant silicon
  – Centralized control
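To make the "Clos topologies out of merchant silicon" principle concrete, here is a small sizing sketch based on the classic three-tier k-ary fat tree (k-port switches give k^3/4 hosts and 5k^2/4 switches). This is the textbook construction, not Google's exact aggregation/spine block structure, and the radices and speeds below are example values only.

```python
# A small sizing sketch for the "Clos topologies from merchant silicon" idea:
# the classic three-tier k-ary fat tree built from identical k-port switches.
# (Textbook construction, not Google's exact Jupiter topology; k and port
# speeds below are example values.)

def fat_tree(k, port_speed_gbps):
    """Return (hosts, switches, aggregate Gb/s) for a k-ary fat tree, k even."""
    hosts = k ** 3 // 4                  # k pods x (k/2 edge switches x k/2 hosts)
    switches = 5 * k ** 2 // 4           # k*k pod switches + (k/2)^2 core switches
    aggregate_gbps = hosts * port_speed_gbps  # every host at line rate (1:1 oversub.)
    return hosts, switches, aggregate_gbps

if __name__ == "__main__":
    for k, speed in [(48, 10), (64, 40)]:
        hosts, switches, agg = fat_tree(k, speed)
        print(f"{k}-port switches @ {speed}G: {hosts} hosts, "
              f"{switches} switches, {agg / 1000:.0f} Tb/s aggregate")
```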
Amdahl's lesser-known law (late 1960s): 1 Mbit/sec of I/O for every 1 MHz of computation in parallel computing
1 server (64 x 2.5 GHz): 100 Gb/sec
50k servers: 5 Pb/sec!!
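Spelling out the arithmetic behind these figures (a rough order-of-magnitude estimate; the 100 Gb/sec per-server number is the round figure used on the slide):

```python
# Rough arithmetic behind the slide's estimate (order of magnitude only).
MBIT_PER_MHZ = 1                     # Amdahl's rule of thumb: 1 Mbit/s of I/O per 1 MHz

cores, clock_ghz = 64, 2.5
per_server_gbps = cores * clock_ghz * MBIT_PER_MHZ  # 160 Gb/s by the strict rule;
                                                    # the slide rounds this to ~100 Gb/s
servers = 50_000
aggregate_pbps = servers * 100 / 1_000_000          # using the slide's 100 Gb/s figure
print(per_server_gbps, "Gb/s per server (rule of thumb)")
print(aggregate_pbps, "Pb/s across 50k servers")    # -> 5.0 Pb/s
```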
Google DCN Over Time
| Datacenter generation | Year | Merchant silicon | ToR configuration | Aggregation block | Spine block | Fabric speed | Host speed | Aggregate bandwidth |
|---|---|---|---|---|---|---|---|---|
| Four-Post CRs | 2004 | - | 48x1G | - | - | 10G | 1G | 2 Tb/sec |
| Firehose 1.0 | 2005 | 8x10G, 4x10G | 2x10G up, 24x1G down | 2x32x10G | 32x10G | 10G | 1G | 10 Tb/sec |
| Firehose 1.1 | 2006 | 8x10G | 4x10G up, 48x1G down | 64x10G | 32x10G | 10G | 1G | 10 Tb/sec |
| Watchtower | 2008 | 16x10G | 4x10G up, 48x1G down | 4x128x10G | 128x10G | 10G | 1G | 82 Tb/sec |
| Saturn | 2009 | 24x10G | 24x10G | 4x288x10G | 288x10G | 10G | 10G | 207 Tb/sec |
| Jupiter | 2012 | 16x40G | 16x40G | 8x128x40G | 128x40G | 10G/40G | 10G/40G | 1.3 Pb/sec |
(Figures: the network up to 2004 vs. from 2005 onward)
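To put the rightmost column in perspective, a quick calculation with the values copied from the table above (a simple ratio, nothing more):

```python
# Aggregate bandwidth per generation, copied from the table above (in Tb/sec).
aggregate_tbps = {
    "Four-Post CRs (2004)": 2,
    "Firehose 1.0 (2005)": 10,
    "Firehose 1.1 (2006)": 10,
    "Watchtower (2008)": 82,
    "Saturn (2009)": 207,
    "Jupiter (2012)": 1300,
}
growth = aggregate_tbps["Jupiter (2012)"] / aggregate_tbps["Four-Post CRs (2004)"]
print(f"Jupiter vs. Four-Post CRs: {growth:.0f}x aggregate bandwidth in 8 years")  # 650x
```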
Firehose 1.0 (2005) and Firehose 1.1 (2006): configuration as in the table above.
Watchtower (2008), with four edge aggregation switches: configuration as in the table above.
Saturn (2009): configuration as in the table above.
Jupiter (2012): configuration as in the table above.
Jupiter Superstack
[Reference] Facebook 6-pack open hardware modular switch
• 40 Gbps x 64 ports
• Dual-stage Clos network
• Linux-based operating system (FBOSS)
25/50 Gigabit Ethernet
• 25 Gigabit Ethernet Consortium
  – Arista, Broadcom, Google, Mellanox, Microsoft
  – http://25gethernet.org/
• 2550100 Alliance
  – http://www.2550100.com/