Hello, this is @int128 from the SRE team.
On AWS, costs per service (S3, RDS, and so on) are relatively easy to monitor, but costs per Pod are much harder to track. Many organizations probably face similar challenges in managing Amazon EKS costs.
In this article, I introduce how we use the Split cost allocation data feature of AWS Cost and Usage Reports (CUR) to monitor per-Pod costs in Datadog on a daily basis.
The problem we wanted to solve
Pod costs are hard to see
Most of Studysapuri's services run on AWS. For per-service AWS costs, we integrate AWS Budgets with Datadog so that they can be monitored in Datadog day to day. For Pods deployed on Amazon EKS, however, costs were hard to see.
Specifically, we had the following problems.
- Service owners cannot keep track of Pod costs continuously
  - We had previously introduced Kubecost, a tool that visualizes Kubernetes costs, so Pod costs could be checked. However, Kubecost is complicated to operate, and the free version limits the data retention period, among other issues.
  - To make cost reviews sustainable, people wanted to check costs in Datadog, the monitoring tool they already use every day.
- SREs cannot analyze Pod costs
  - The SRE team reviews AWS costs every month. When EC2 instance costs go up or down, we need to analyze the cause. Because per-Pod costs were unknown, we could only guess at the cause from deployment history, HPA (Horizontal Pod Autoscaler) fluctuations, and similar signals.
- Quantitative discussion is difficult
  - When a discussion about Pod costs came up, we could not base it on actual data, and it tended to become armchair speculation.
The ideal state
Being able to monitor per-Pod costs day to day brings the following benefits.
- Service owners can optimize HPA settings, resource requests, and so on, and review costs continuously
- SREs can use Pod costs when analyzing EC2 cost changes in the monthly cost check
- Pod costs can be used for half-year and annual budget estimates
Solution: using CUR Split cost allocation data
AWS Cost and Usage Reports (CUR) is a service that provides detailed information about AWS billing and costs. Its Split cost allocation data feature allocates EC2 instance costs to Pods according to the amount of resources each Pod requests.
How cost allocation works
With CUR Split cost allocation data, costs are allocated by the following logic.
- Allocation by resource requests
  - The EC2 instance cost is split into CPU and memory (weighted at a ratio of 1 CPU core : 1 GB of memory = 9:1).
  - The EC2 cost is allocated based on each Pod's CPU request and memory request.
- Allocation of unused resources
  - The cost of CPU and memory that is unused on the EC2 instance is redistributed in proportion to resource requests.
For example, suppose there is a fictional EC2 instance with 4 CPU cores and 16 GB of memory at an hourly price of $52. The cost of this EC2 instance is split as follows.
- $36 for 4 CPU cores ($9 per core)
- $16 for 16 GB of memory ($1 per GB)
If a Pod requests 1 CPU core and 3 GB of memory, the following costs are allocated to that Pod.
- Allocation based on resource requests
  - CPU cost: $9 × 1 core = $9
  - Memory cost: $1 × 3 GB = $3
- Allocation of unused resources
See the AWS documentation for the details of cost allocation.
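To make the arithmetic above concrete, here is a minimal Python sketch. It assumes the simplified model described in this article (a 9:1 CPU-to-memory weight, with unused capacity redistributed in proportion to requests). The function names and the second Pod (`pod-b`) are hypothetical, and AWS's actual calculation has more detail than this.

```python
def unit_prices(hourly_price, cpu_cores, memory_gb, cpu_weight=9.0, memory_weight=1.0):
    """Split an instance price into a per-core and a per-GB price."""
    total_weight = cpu_cores * cpu_weight + memory_gb * memory_weight
    per_weight = hourly_price / total_weight
    return per_weight * cpu_weight, per_weight * memory_weight


def allocate(requests, capacity, unit_price):
    """Return {pod: (split_cost, unused_cost)} for a single resource."""
    total_requested = sum(requests.values())
    # Cost of the capacity that no Pod requested.
    unused_cost = max(capacity - total_requested, 0.0) * unit_price
    result = {}
    for pod, requested in requests.items():
        share = requested / total_requested if total_requested else 0.0
        result[pod] = (requested * unit_price, unused_cost * share)
    return result


# The fictional instance from the text: 4 cores, 16 GB, $52 per hour.
cpu_price, mem_price = unit_prices(52.0, 4, 16)

# Hypothetical requests: pod-a is the Pod from the text, pod-b is made up.
cpu = allocate({"pod-a": 1.0, "pod-b": 1.0}, 4, cpu_price)
mem = allocate({"pod-a": 3.0, "pod-b": 5.0}, 16, mem_price)

for pod in ("pod-a", "pod-b"):
    print(pod, "cpu:", cpu[pod], "memory:", mem[pod])
```

Under these assumed requests, the split and unused costs of the two Pods add up to the instance's full $52.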
Data structure of Split cost allocation data
Let's create a new CUR in the AWS console and look at the actual data structure. Configure the CUR as follows.
- The output format can be chosen from CSV, Parquet, and others. Here we choose Parquet so that the data can be aggregated with tools.
- Enable Split cost allocation data.
- Enable Include resource IDs.
After a while, the CUR data is saved to the S3 bucket. When we checked the actual data, it contained 282 columns. Because CUR also includes cost data for things other than Pods, the rows and columns related to Pod costs need to be extracted.
The main columns related to Pod costs are as follows.
- line_item_resource_id: resource ID of the Pod (format: arn:aws:eks:REGION:ACCOUNT:pod/CLUSTER_NAME/NAMESPACE/POD_NAME/CONTAINER_ID)
- line_item_usage_type: APN1-EKS-EC2-vCPU-Hours or APN1-EKS-EC2-GB-Hours
- split_line_item_split_cost: cost allocated to the Pod
- split_line_item_unused_cost: cost of unused resources allocated to the Pod
- resource_tags_aws_eks_cluster_name: name of the EKS cluster the Pod belongs to
- resource_tags_aws_eks_namespace: name of the Namespace the Pod belongs to
- resource_tags_aws_eks_deployment: tag information for identifying the Pod
- resource_tags_aws_eks_workload_name: tag information for identifying the Pod
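As a small illustration of the resource ID format listed above, the following Python sketch splits a `line_item_resource_id` into its parts. The `PodResource` type, the function name, and the sample ARN are hypothetical and exist only for this example.

```python
from typing import NamedTuple


class PodResource(NamedTuple):
    region: str
    account: str
    cluster: str
    namespace: str
    pod: str
    container_id: str


def parse_pod_resource_id(resource_id: str) -> PodResource:
    """Parse arn:aws:eks:REGION:ACCOUNT:pod/CLUSTER/NAMESPACE/POD/CONTAINER_ID."""
    # Unpacking raises ValueError when there are fewer than five segments.
    prefix, cluster, namespace, pod, container_id = resource_id.split("/", 4)
    parts = prefix.split(":")
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "eks" or parts[5] != "pod":
        raise ValueError(f"not an EKS Pod resource ID: {resource_id!r}")
    return PodResource(parts[3], parts[4], cluster, namespace, pod, container_id)


# Hypothetical sample value, not taken from real CUR data.
sample = "arn:aws:eks:ap-northeast-1:123456789012:pod/my-cluster/default/web-5d4f8c7b9d-x2k4q/0123abcd"
resource = parse_pod_resource_id(sample)
print(resource.namespace, resource.pod)
```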
Normalizing the aggregation unit
Deciding the unit by which Pod costs are aggregated is an important question. It is important to sort out the use cases in which service owners and SREs actually use the cost data. We also need to take into account that various controllers manage Pods in a Kubernetes cluster.
Specifically, the following requirements need to be considered.
- Easy for service owners to recognize (not too fine-grained)
- Easy for SREs to use in root-cause analysis (not too coarse)
- Handling of CronJobs and Jobs
- Handling of DaemonSets and StatefulSets
- Handling of custom resources such as GitHub Actions runners and Argo Workflows
In the end, we determine the aggregation unit with the following query. For custom resources such as Argo Workflows, it is difficult to cover every Pod name pattern, so we consider it good enough if the costs are roughly aggregated.
-- Extract POD_NAME from "arn:aws:eks:REGION:ACCOUNT:pod/CLUSTER_NAME/NAMESPACE/POD_NAME/CONTAINER_ID"
CREATE OR REPLACE TEMPORARY MACRO extract_pod_name(line_item_resource_id) AS
  split_part(line_item_resource_id, '/', 4);

SELECT
  -- Excerpt below
  CASE
    -- Deployment or Rollout
    WHEN resource_tags_aws_eks_deployment != '' THEN resource_tags_aws_eks_deployment
    -- DaemonSet or StatefulSet
    WHEN resource_tags_aws_eks_workload_type IN ('DaemonSet', 'StatefulSet') THEN resource_tags_aws_eks_workload_name
    -- ReplicaSet or Job
    WHEN resource_tags_aws_eks_workload_type != '' THEN regexp_replace(resource_tags_aws_eks_workload_name, '-\w+$', '')
    -- GitHub Actions
    WHEN resource_tags_aws_eks_namespace = 'arc-runners' THEN regexp_replace(extract_pod_name(line_item_resource_id), '-\w+-runner-.+$', '')
    WHEN resource_tags_aws_eks_namespace = 'arc-systems' THEN regexp_replace(extract_pod_name(line_item_resource_id), '-\w+-listener$', '')
    -- Argo Workflows
    WHEN resource_tags_aws_eks_namespace LIKE '%-workflow%' THEN regexp_replace(extract_pod_name(line_item_resource_id), '(-\w+-main)?-\d+.*$', '')
    ELSE extract_pod_name(line_item_resource_id)
  END AS workload_name,
  line_item_usage_type AS cost_usage_type
FROM 'split-cost-allocation-data/*.parquet'
WHERE resource_tags_aws_eks_cluster_name != ''
  AND resource_tags_aws_eks_namespace != ''
  AND split_line_item_split_cost > 0
GROUP BY
  resource_tags_aws_eks_cluster_name,
  resource_tags_aws_eks_namespace,
  resource_tags_aws_eks_workload_type,
  workload_name,
  line_item_usage_type
ORDER BY cost DESC
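To check how the CASE expression in the query groups names, it can help to mirror it outside the database. The following Python sketch is one such mirror, reusing the same regular expressions. The function name and all sample workload and Pod names are hypothetical, and it treats missing values as empty strings, which may not match SQL NULL semantics exactly.

```python
import re


def _strip(pattern: str, name: str) -> str:
    # Remove the first match only; every pattern is anchored at the end with `$`.
    return re.sub(pattern, "", name, count=1, flags=re.ASCII)


def normalize_workload_name(*, deployment="", workload_type="", workload_name="",
                            namespace="", pod_name="") -> str:
    """Mirror the branch order of the CASE expression in the query."""
    if deployment:  # Deployment or Rollout
        return deployment
    if workload_type in ("DaemonSet", "StatefulSet"):
        return workload_name
    if workload_type:  # ReplicaSet or Job: drop the trailing generated suffix
        return _strip(r"-\w+$", workload_name)
    if namespace == "arc-runners":  # GitHub Actions runner Pods
        return _strip(r"-\w+-runner-.+$", pod_name)
    if namespace == "arc-systems":  # GitHub Actions listener Pods
        return _strip(r"-\w+-listener$", pod_name)
    if "-workflow" in namespace:  # Argo Workflows, same as LIKE '%-workflow%'
        return _strip(r"(-\w+-main)?-\d+.*$", pod_name)
    return pod_name


# Hypothetical names for illustration only.
print(normalize_workload_name(workload_type="ReplicaSet", workload_name="web-5d4f8c7b9d"))
print(normalize_workload_name(workload_type="Job", workload_name="backup-28391040"))
print(normalize_workload_name(namespace="arc-runners", pod_name="my-runners-abcde-runner-fghij"))
```

With these sample inputs, Pods that differ only in their generated suffix collapse into one aggregation unit, which is the property the query relies on.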
Building the data pipeline
The goal is to aggregate the CUR data stored in the S3 bucket and make it available to service owners in Datadog. To keep this sustainable, the aggregation needs to be automated rather than done by hand.
We run the following job on a schedule with GitHub Actions.
- Download this month's Parquet files from the S3 bucket.
- Run the SQL with DuckDB and save the result to a CSV file.
- Send the data in the CSV file to Datadog with send-datadog-action.
How we use it in practice
The following metric is now available in Datadog.
- Metric name: aws_eks_pod_monthly_cost
- Metric value: cost of the Pods that ran from the beginning of the month through today
- Available tags:
  - kube_cluster_name: cluster name
  - kube_namespace: Namespace name
  - workload_name: aggregation unit of the Pods
  - workload_type: kind of resource that owns the Pods
  - cost_usage_type: CPU cost (apn1-eks-ec2-vcpu-hours) or memory cost (apn1-eks-ec2-gb-hours)
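As a rough sketch of how aggregated CSV rows could map onto this metric and its tags, the Python below builds one metric point per row. The CSV column names, the function name, and the sample row are assumptions made for this example. The dictionary shape resembles a Datadog metric series, but the input that send-datadog-action actually expects may differ.

```python
import csv
import io

METRIC_NAME = "aws_eks_pod_monthly_cost"


def rows_to_series(csv_text: str, timestamp: int) -> list:
    """Turn aggregated CSV rows into metric points with tags."""
    series = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        tags = [
            f"kube_cluster_name:{row['cluster_name']}",
            f"kube_namespace:{row['namespace']}",
            f"workload_name:{row['workload_name']}",
            f"workload_type:{row['workload_type']}",
            # Lowercased here to match the tag values listed in this article.
            f"cost_usage_type:{row['cost_usage_type'].lower()}",
        ]
        series.append({
            "metric": METRIC_NAME,
            "type": "gauge",
            "points": [[timestamp, float(row["cost"])]],
            "tags": tags,
        })
    return series


# Hypothetical aggregated output; the column names are assumptions.
sample_csv = (
    "cluster_name,namespace,workload_type,workload_name,cost_usage_type,cost\n"
    "my-cluster,default,ReplicaSet,web,APN1-EKS-EC2-vCPU-Hours,12.5\n"
)
series = rows_to_series(sample_csv, timestamp=1700000000)
print(series[0]["tags"])
```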
We provide a Powerpack so that service owners can easily add Pod costs to their Datadog dashboards.


Caveats
Avoid comparing costs between Pods
The ratio of CPU to memory unit prices (9:1) is a value chosen by AWS and is not suitable for comparing the relative cost efficiency of Pods. For example, suppose there are the following Pods.
- Pod A: small CPU request, large memory request
- Pod B: large CPU request, small memory request
In this case, it is not possible to say in general which Pod is more efficient. Cost efficiency can vary greatly depending on the type of EC2 instance the Pod is placed on, how much of the instance is unused, and other factors. For that reason, it is better to avoid comparing costs between Pods.
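The effect of unused resources can be illustrated with a short Python sketch. It assumes the simplified allocation model described in this article, reuses the fictional 4-core, 16 GB, $52 instance, and compares one Pod under two hypothetical placements. The function name and the neighbouring requests are made up for this example.

```python
def pod_total_cost(price, cores, memory_gb, pod, others):
    """Split cost plus unused cost for one Pod.

    `pod` and `others` are (cpu_cores, memory_gb) request tuples;
    the Pod's own requests must be greater than zero.
    """
    per_weight = price / (cores * 9 + memory_gb * 1)  # 9:1 CPU-to-memory weight
    total = 0.0
    for capacity, unit_price, mine, theirs in (
        (cores, per_weight * 9, pod[0], others[0]),
        (memory_gb, per_weight * 1, pod[1], others[1]),
    ):
        requested = mine + theirs
        unused_cost = max(capacity - requested, 0.0) * unit_price
        # Unused cost is shared in proportion to requests.
        total += mine * unit_price + unused_cost * mine / requested
    return total


# Same Pod (1 core, 3 GB), two hypothetical placements.
alone = pod_total_cost(52.0, 4, 16, pod=(1.0, 3.0), others=(0.0, 0.0))
packed = pod_total_cost(52.0, 4, 16, pod=(1.0, 3.0), others=(3.0, 13.0))
print(alone, packed)
```

Under these assumptions, the same Pod is charged very differently depending on what else runs on the node, which is why a Pod-to-Pod comparison says little about efficiency.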
We recommend using this metric to monitor your own team's Pods over the long term.
Limits of the aggregation unit
Currently, the AWS CUR data does not include Pod labels, so costs cannot be aggregated by Pod label. For example, we put the microservice name and the owning team in our Pod labels, and being able to aggregate by those units would make this even more useful.
Summary
By using AWS CUR Split cost allocation data, we were able to achieve per-Pod cost monitoring, which had long been a challenge. This effort has produced the following results.
- Continuous cost monitoring: service owners can check Pod costs in Datadog, which they use every day
- Quantitative decision-making: the effect of resource adjustments and similar changes can be measured in numbers
We hope this is a useful reference for organizations facing similar challenges.