SRE / DevOps / Kubernetes Weekly Report Roundup #30 (8/23-8/28) - 運び屋 (A carrier (forwarder) changed his career to an engineer)

This post is a set of notes and links I put together after reading the following three Weekly Reports published between 8/23 and 8/28, 2020.
The English version of this blog is here.
DEVOPS WEEKLY ISSUE #504 August 23rd, 2020
- News
- Tools
SRE Weekly Issue #232 August 23rd, 2020
- Articles
- Outages
KubeWeekly #230 August 28th
Upcoming CNCF webinars

I hope it serves as a source of information for someone, or saves some search time.

DEVOPS WEEKLY ISSUE #504 August 23rd, 2020

SRE Weekly Issue #232 August 23rd, 2020

KubeWeekly #230 August 28th, 2020

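If anything in this post raises questions or seems unclear, I would appreciate it if you could check the original article via its URL and then point it out to me.
I make a point of commenting even on topics I understand only superficially, so there are likely to be many mistakes caused by my own misunderstandings or insufficient explanations.
Because there is a lot of information, I have limited the post to text and links.
Some of the articles featured in each report date from 2019 or earlier, so they are not necessarily the latest.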

DEVOPS WEEKLY ISSUE #504 August 23rd, 2020

News

A great post on the changing role of operations. Some good tips for those wondering what modern ops looks like, with tips on vendor management, outsourcing infrastructure and the importance of understanding sociotechnical systems.

The title is "The Future of Ops Jobs".
It discusses the future of ops work, walking through the following three ongoing shifts.
- From monolith to microservices
- From monitoring to observability
- From magic autoinstrumentation to instrumenting with intent
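For readers who get excited by infrastructure problems, the author recommends joining an infrastructure company, since those problems are growing in number, and wishes them luck. For everyone else, the author recommends building systems that let engineering teams ship software that creates core business value, from the following four perspectives. I often heard about vendor control in another industry, and if it can be elevated into engineering, it does seem to offer real value as an intermediary role.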
- Vendor engineering
- Product engineering
- Sociotechnical systems engineering
- Managing the portfolio of technical investments.

A good introduction to NAT networks, for anyone wanting to understand this area of networking better. Good diagrams and examples and lots of details.

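The title is "How NAT traversal works".
An article on NAT that starts from a simple peer-to-peer connection and explains the various problems, protocols, and components such as firewalls that come into play.
It is long, so I skimmed parts of it. I will reread it later.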

Metrics are used for lots of different purposes, including reporting to the top of an organisation. This post explores engineering KPIs for board room conversations.

The title is "How to Choose Software Development KPIs for Your Board Deck".
Aimed at CTOs, it explains which software development KPIs to prepare for a board meeting and how to have a productive conversation there.
- Start with Engineering Success Metrics
- Drill Down with Revealing Engineering KPIs
- Put Engineering Metrics in Conversation
- Make Board Meetings Work for You

Ever wanted to ensure that messages between services are kept in order, with a retry mechanism for any lost messages? This post describes a specific pattern, but is also part of a set of articles on distributed computing patterns that’s worth exploring.

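The title is "Single Socket Channel".
From the blog of Martin Fowler, an author, speaker, and commentator on software development. It explains the problem that the Single Socket Channel pattern solves and how it solves it.
The explanation links to topics already covered in earlier posts on the same blog, which is very helpful. It lets you dig deeper into web-related technology.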
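
To make the idea concrete for myself, here is a minimal Python sketch of the ordering half of the pattern: every message goes over one long-lived connection, so the receiver sees messages in the order they were sent. The class name `SingleSocketChannel`, the newline framing, and the use of `socket.socketpair()` in place of a real TCP connection are all my own assumptions rather than code from the article, and the retry mechanism for lost messages is left out.

```python
import socket


class SingleSocketChannel:
    """Sends every message over one long-lived connection (hypothetical name)."""

    def __init__(self, sock):
        self._sock = sock
        self._next_seq = 0

    def send(self, payload):
        seq = self._next_seq
        self._next_seq += 1
        # Framing is an assumption of this sketch: "<seq> <payload>\n".
        self._sock.sendall(f"{seq} {payload}\n".encode("utf-8"))
        return seq


def read_messages(sock, count):
    """Read `count` newline-delimited messages from the receiving end."""
    messages = []
    with sock.makefile("r", encoding="utf-8") as reader:
        for _ in range(count):
            seq, payload = reader.readline().rstrip("\n").split(" ", 1)
            messages.append((int(seq), payload))
    return messages


def demo():
    # socketpair() stands in for a real connection between two services.
    sender_sock, receiver_sock = socket.socketpair()
    try:
        channel = SingleSocketChannel(sender_sock)
        for payload in ["set x=1", "set x=2", "delete x"]:
            channel.send(payload)
        return read_messages(receiver_sock, 3)
    finally:
        sender_sock.close()
        receiver_sock.close()
```

Because there is only one stream, the sequence numbers arrive in ascending order without any reordering logic on the receiving side; that is the property I take the pattern to rely on.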

Incident reviews are increasingly common but often hard to do well. This video and detailed transcript has various tips for improving the process.

The title is "Improving Postmortems from Chores to Masterclass with Paul Osman".
I covered this two weeks ago under SRE Weekly Issue #231, so I will skip it here.

An ambitious idea for a new journal for Systems research. Definitely relevant to the interests of some readers of Devops Weekly I think.

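The title is "A new journal for systems research".
An article that lists the current problems with the review process and publication model in systems research, then introduces and explains the Journal of Systems Research (jsysr.org) as a remedy.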

Pulumi, the Infrastructure as Code tool, now supports using Open Policy Agent to validate the resulting resources. This post explores why and how.

The title is "Authoring CrossGuard Policy with Open Policy Agent (OPA)".
An article by Pulumi explaining the addition of support for OPA (Open Policy Agent) and its Rego language to CrossGuard, Pulumi's policy as code framework.

Even if you’re not writing applications in Java, it’s often useful to have some knowledge of how logging works as you’ll probably end up running at least some Java applications. These posts provide a solid foundation.

The titles are "Java Logging Tutorial: Basic Concepts to Help You Get Started" (the link above) and "Java Logging Best Practices: 10+ Tips You Should Know to Get the Most Out of Your Logs".
The first article focuses on how to set up logging in your code properly so as to avoid well-known Java logging mistakes, and covers the following.
- Logging abstraction layers for Java
- Out of the box Java logging capabilities
- Java logging libraries, their configuration, and usage
- Logging the important information
- Log centralization solutions.
The second article explains the following 14 Java logging best practices.
1. Use a Standard Logging Library
2. Select Your Appenders Wisely
3. Use Meaningful Messages
4. Logging Java Stack Traces
5. Logging Java Exceptions
6. Use Appropriate Log Level
7. Log in JSON
8. Keep the Log Structure Consistent
9. Add Context to Your Logs
10. Java Logging in Containers
11. Don’t Log Too Much or Too Little
12. Keep the Audience in Mind
13. Avoid Logging Sensitive Information
14. Use a Log Management Solution to Centralize & Monitor Java Logs
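
Several of these tips (7, 9, and 13 in particular) are not specific to Java. As a rough illustration, the sketch below uses Python's standard `logging` module instead of a Java library to emit JSON lines, attach context, and mask sensitive fields. The logger name `orders`, the `context` key passed through `extra`, and the `SENSITIVE_KEYS` set are hypothetical choices of mine and do not come from the articles.

```python
import io
import json
import logging

# Hypothetical list of fields that must never appear in logs in clear text.
SENSITIVE_KEYS = {"password", "token"}


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record):
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # "context" is a custom attribute supplied through `extra=`.
        for key, value in getattr(record, "context", {}).items():
            entry[key] = "***" if key in SENSITIVE_KEYS else value
        return json.dumps(entry, sort_keys=True)


def build_logger(stream):
    logger = logging.getLogger("orders")
    logger.setLevel(logging.INFO)  # DEBUG records are dropped
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def demo():
    stream = io.StringIO()
    logger = build_logger(stream)
    logger.info(
        "order created",
        extra={"context": {"order_id": 42, "token": "abc123"}},
    )
    logger.debug("this line is below the configured level")
    return [json.loads(line) for line in stream.getvalue().splitlines()]
```

In a Java application the same ideas would map onto whichever logging library and appenders the project has standardized on; the point here is only the shape of the output, not a recommended setup.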

Tools

Tags are critical to managing AWS resources at scale. Awstaghelper provides a command line tool to ease adding and managing tags to and from CSV files across the wide range of AWS resources.

GitHub page of "Awstaghelper", an OSS tool for tagging hundreds of AWS resources with a few commands.

The GitOps Toolkit is a set of composable APIs and specialized tools that can be used to build a Continuous Delivery platform on top of Kubernetes. They should provide the underpinnings for the v2 of Flux, but could also be used to build other interesting high-level tools that take the same control loop approach.

The .io page of "GitOps Toolkit", a set of composable APIs and specialized tools that can be used to build a continuous delivery platform on top of Kubernetes.

Kip is a Virtual Kubelet provider that allows a Kubernetes cluster to transparently launch pods onto their own cloud instances. Handy if you require additional workload isolation.

GitHub page of "Kip (Kubernetes Cloud Instance Provider)", a Virtual Kubelet provider that lets a Kubernetes cluster transparently launch pods onto their own cloud instances.

SRE Weekly Issue #232 August 23rd, 2020

Articles

Incident updates, interruptions and the 30 minute window

An engineer’s observation of a really effective Incident Command pattern.

Dean Wilson

An article analyzing an incident response setup and explaining how keeping 30 minutes as an internal clock made the response effective.

Thoughts on STAMP

Here’s Lorin Hochstein’s take on the STAMP (Systems-Theoretic Accident Model and Processes) workshop he attended recently.

Lorin Hochstein

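The author, who has touched on STAMP in earlier blog posts, attended the MIT STAMP workshop and describes his impressions of STAMP afterwards.
The author's habit of clearly separating what he finds positive about an idea from what he is skeptical about, and adopting it accordingly, is always instructive.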

HRO and RE: a pragmatic perspective

What’s the difference between Resilience Engineering and High Reliability Organizations? This paper (and excellent summary) explains.

Torgeir Haavik, Stian Antonsen, Ragnar Rosness, and Andrew Hale (original paper)
Thai Wood — Resilience Roundup (summary)

It picks up and explains the paper "HRO and RE: a pragmatic perspective". As noted above, HRO = High Reliability Organization and RE = Resilience Engineering.

The Future of Ops Jobs

This one focuses on what I feel are really important parts of SRE, taken from the article’s subheadings:

● Vendor engineering
● Product engineering
● Sociotechnical systems engineering
● Managing the portfolio of technical investments

Charity Majors — Honeycomb

Covered above under DEVOPS WEEKLY ISSUE #504, so I will skip it here.

Outage report 7 July 2020 – PythonAnywhere

Now that’s a for-serious incident report. Nice one, folks! This is an interesting case of theory-meets-reality for disaster planning.

giles — PythonAnywhere

PythonAnywhere's report on its largest outage since July 2017. The cause was a failure in the storage system.

Outages

Equinix
- Equinix had a power failure in a London datacenter.
WhatsApp
Crunchyroll
Deliveroo
Google Cloud Platform
Squarespace
Spotify
- Looks like it may have been an expired TLS certificate.
G Suite

Outage information for the companies listed above.

KubeWeekly #230 August 28th

The Headlines

Editor’s pick of the highlights from the past week.

Congratulations to the release team on getting Kubernetes 1.19 out the door. This release is all about extra time: the timelines were adjusted due to world events, and it will be the first to be supported for 12 months. This should allow an extra 30% of Kubernetes users to remain on a supported version on their regular upgrade cadence. The release includes 33 enhancements, including Ingress finally going to GA. Check out an interview with the release manager Taylor Dolezal on this week’s Kubernetes Podcast to learn more.

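An article on the Kubernetes 1.19 release. Events such as COVID-19 and the George Floyd protests changed the usual release cycle. The release brings many changes, including the support period being extended to one year. More detail is discussed in the Kubernetes Podcast episode linked above as "interview".
The interview link above goes to the Kubernetes Podcast, run by Google employees. The current co-hosts are Craig Box and Adam Glick.
The guest is Taylor Dolezal, senior Developer Advocate at HashiCorp, release lead for Kubernetes 1.19, and CNCF Ambassador.
The joke "Communication is difficult as DNS", made while the Kubernetes Podcast guest was talking about OSS, gradually grew on me.
The topics that caught my attention in News of the week are as follows.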
- k3s to join the CNCF Sandbox
- Serverless Framework Knative component
- Palinurus, from Mailchannels
- Carvel
- The Kubernetes Handbook by Farhan Hasin Chowdhury

A Look Back at our FIRST KubeCon + CloudNativeCon Virtual Conference

Priyanka Sharma, CNCF

Priyanka Sharma recaps the first virtual KubeCon + CloudNativeCon and the event’s success thanks to our amazing community of doers – builders, operators and advocates! She writes, “we are so thrilled that the cloud native community came together with hope and positivity to make this a truly community-driven event we will remember for a long time. We may not have been able to meet in person this year but we are indomitable!” Read the recap blog here.

A retrospective on KubeCon + CloudNativeCon Virtual written by CNCF staff.

ICYMI: CNCF Webinars

You can view all CNCF recorded and upcoming webinars here.

CNCF Member Webinar: Modern Software Development Pipeline: A Security Reference Architecture

Vinay Venkataraghavan, Cloud CTO, Prisma Cloud @Palo Alto Networks

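It covers the following points.
1. A survey of a typical deployment pipeline and the threats that need to be mitigated
2. A proposed reference architecture for embedding security controls
3. Several practical examples of security tools that can be built into the entire software delivery lifecycle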

CNCF Member Webinar: MLOps automation with Git Based CI/CD for ML

Yaron Haviv, Co-Founder and CTO @Iguazio

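It explains how ML pipelines work, the main challenges, and the various steps involved in creating models and data products (data collection, preparation, training/AutoML, validation, model deployment, drift monitoring, and so on).
It shows the following ways to greatly simplify and automate the development and deployment process.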
1. Maximize the efficiency and collaboration between the various teams
2. Harness Git review processes to evaluate models
3. Abstract away the complexity of Kubernetes and DevOps.

CNCF Member Webinar: Local Development in The Age of Kubernetes

Misha Gusarov, Software Architect @Ridge Cloud

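It explains how to bring back interactivity in a way that makes developing and debugging applications as easy as possible. The approach is to examine the building blocks of Kubernetes and recreate their functions in a local development environment.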

CNCF Member Webinar: How to migrate databases into Kubernetes?

Alex Chircop, CEO & Founder @StorageOS and Ferran Castell, Product Reliability Engineer @StorageOS

It explains the following. Aimed at people thinking about migrating stateful workloads such as databases onto Kubernetes.
- How to deploy databases in production in Kubernetes
- How to implement automatic failover with high availability
- How to migrate a database into a Kubernetes cluster
- How to build a database as a service with Kubernetes

Guys!!!!
My first major open source project got merged today!!

I am over excited!!!

It took me months to build this project but with perseverance and hard work, I completed it!

To my mentors @bwplotka and Lucas, and the entire @ThanosMetrics community, I love you 💜!! pic.twitter.com/HXAC4ceObb
— Uchechukwu Obasi (@Thisisobate) August 24, 2020

The Technical

Tutorials, tools, and more that take you on a deep dive into the code.

Introducing Hierarchical Namespaces

Adrian Ludwin, Google

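An article on Kubernetes.io introducing "Hierarchical Namespaces", a new concept developed by the Kubernetes Working Group for Multi-Tenancy (wg-multitenancy).
Through the concept of ownership across namespaces, it adds the following two behaviors: policy inheritance and delegated creation of resources.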
1. Policy inheritance: if one namespace is a child of another, policy objects such as RBAC RoleBindings are copied from the parent to the child.
2. Delegated creation: you usually need cluster-level privileges to create a namespace, but hierarchical namespaces adds an alternative: subnamespaces, which can be manipulated using only limited permissions in the parent namespace.
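
As a way of checking my own understanding of these two behaviors, here is a toy in-memory model in Python. It is not the Hierarchical Namespace Controller API: the `Namespace` class, `create_subnamespace`, and the binding names are hypothetical, and where the article describes policy objects being copied into child namespaces, this sketch simply computes the effective set by walking up the tree.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Namespace:
    name: str
    parent: Optional["Namespace"] = None
    # RoleBindings defined directly in this namespace (names only).
    role_bindings: set = field(default_factory=set)

    def effective_role_bindings(self):
        """Own bindings plus everything inherited from ancestors."""
        inherited = self.parent.effective_role_bindings() if self.parent else set()
        return inherited | self.role_bindings


def create_subnamespace(parent, name):
    """Delegated creation: a child is created relative to its parent,
    so it starts out inheriting the parent's policy."""
    return Namespace(name=name, parent=parent)


def demo():
    team = Namespace("team-a", role_bindings={"team-a-admins"})
    service = create_subnamespace(team, "service-1")
    service.role_bindings.add("service-1-debuggers")
    return team, service
```

In this model a binding added to `team-a` shows up in `service-1`, while a binding added to `service-1` does not leak back up to the parent, which is how I read the inheritance direction described above.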

Moving Forward From Beta

Tim Bannister, The Scale Factory

An article focusing on the beta stage of the Kubernetes lifecycle.
It touches on topics such as how a Kubernetes Enhancement Proposal (KEP) turns into code and the lifecycle flow of alpha → beta → stable (general availability).

Design Considerations at the Edge of the ServiceMesh

Raffaele Spazzoli, Trevor Box, and Joshua Mathianas at Red Hat

An article introducing a set of design patterns for traffic flowing into and out of the mesh.

Zero-Downtime Kubernetes Deployments

Oliver Leaver-Smith, Sky Betting & Gaming

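Drawing on the work Core Customer did over the past several months to migrate an OIDC / OAuth2 identity service from a tactical container platform to on-premises Kubernetes clusters, it explains how to deploy on Kubernetes with zero downtime.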

Google chooses Cilium for GKE networking

Thomas Graf, Isovalent

An article on Cilium.io. Following GCP's announcement that GKE Dataplane V2 will use Cilium and eBPF, it explains the behind-the-scenes story that led to this outcome.

ArgoCD and Tekton: Match made in Kubernetes heaven

Burr Sutter and Siamak Sadeghianfar, Red Hat

A webinar video on Twitch by Red Hat's OpenShift team.
I would like to build something I want to run myself and try out CI/CD in earnest.

An introduction to installing Prometheus with Minikube

Shashank Nandishwar Hegde, Red Hat

It explains the basic concepts of Prometheus and how to install it on minikube. The next article will apparently cover application monitoring.

How To Manage Your Kubernetes Configurations with Kustomize

DigitalOcean

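It explains how to manage Kubernetes configuration with Kustomize through the following three points.
1. Build a small web application and use Kustomize to manage configuration sprawl
2. Deploy the app to development and production environments with different configurations
3. Use Kustomize bases and overlays to layer these variable configurations so that the code is easier to read and maintain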
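
The base and overlay idea can be sketched in a few lines of Python. This is only a conceptual model using a recursive dictionary merge; Kustomize's actual patching is richer than this (lists in particular are handled differently), and the ConfigMap contents and the `dev` / `prod` namespaces below are made-up examples rather than anything from the tutorial.

```python
import copy


def apply_overlay(base, overlay):
    """Return a new config where overlay values override or extend the base."""
    result = copy.deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Merge nested mappings instead of replacing them wholesale.
            result[key] = apply_overlay(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Shared base: everything the environments have in common.
BASE = {
    "apiVersion": "v1",
    "kind": "ConfigMap",
    "metadata": {"name": "web-config"},
    "data": {"LOG_LEVEL": "debug", "REPLICA_HINT": "1"},
}

# Each overlay lists only what differs from the base.
DEV_OVERLAY = {"metadata": {"namespace": "dev"}}
PROD_OVERLAY = {
    "metadata": {"namespace": "prod"},
    "data": {"LOG_LEVEL": "warn"},
}
```

Calling `apply_overlay(BASE, PROD_OVERLAY)` yields a production config with the overridden log level while leaving `BASE` untouched, which mirrors why the layered approach keeps per-environment files small.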

The Editorial

Articles, announcements, and more that give you a high-level overview of challenges and features.

Terrascan Leverages OPA to Make Policy as Code Extensible

Cesar Rodriguez, Accurics

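It explains the history of "Terrascan", an OSS tool that reduces risk by detecting compliance and security violations across infrastructure as code before cloud native infrastructure is provisioned, including its move from regex-based rules to the OPA engine.
Terrascan's GitHub page is here. The documentation is here.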

Kubernetes engineers keep your favorite software running

Megan Friedman, The Keyword

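An article on Google's blog "The Keyword" marking the fifth anniversary of GKE with interviews with three engineers who contribute to GKE and Kubernetes (Michelle Au, Janet Kuo and Purvi Desai).
It presents each of their views on GKE, Kubernetes, favorite customer stories, advice for developers about to enter this field, and more.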

Looking ahead as GKE, the original managed Kubernetes, turns 5

Chen Goldberg and Drew Bradstock, Google Cloud

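A GCP article. Timed to GKE's fifth anniversary and the start of the virtual KubeCon, it thanks the community that made Kubernetes what it is and turned it into the industry standard for managing containerized applications.
Looking ahead, it shares the following five ways in which Google will keep working to make GKE the best place to run Kubernetes.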
1. Leaving no app behind
2. Saving money with optimal price-to-performance by default
3. Container-native networking: no more square pegs in round holes
4. Bringing BeyondProd to containerized apps
5. Democratizing access to learning Kubernetes

KubeCon EU: Accurics, Snyk Release Tools to Secure Infrastructure-as-Code Deployments

Joab Jackson, The New Stack

An article from The New Stack. It reports that Accurics' OSS tool "Terrascan", covered above, and Snyk's Snyk IaC were released to coincide with KubeCon EU.

Use Virtual Clusters to Tame Sprawl in Kubernetes

Emily Omier, Nirmata

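An article from Nirmata. Stating that "as more organizations adopt Kubernetes and begin to struggle with enforcing best practices and with the management and resource utilization problems associated with cluster sprawl, they are starting to apply to clusters the same virtualization techniques used for other resources", it explains an approach based on virtual clusters and leads into an introduction of the company's own service.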

Complexity: Your Day 2 Enemy

Emily Omier, Nirmata

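Another article from Nirmata, this time on the complexity of Kubernetes.
Stating that "organizations need to focus on minimizing the inherent complexity of Kubernetes by ensuring consistent configuration and consistent application design across clusters while using tools that simplify the developer and operator experience", it leads into an introduction of the company's own service, which "helps organizations ease complexity in both the adoption and operation phases".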

JaegerTracing announces v1.19 release

The release page for Jaeger v1.19 on GitHub.

Simplify Edge Networking for Different Kubernetes Providers

Noah Krause, ITNext

An article introducing Ambassador Labs' K8s Initializer, a tool that bootstraps networking, ingress, CI/CD, and observability for a new Kubernetes cluster.

I wrote up some thoughts on last week's KubeCon, including some talks I think you should watch. Check it out: https://t.co/DXifnD8eed
— Rich Burroughs (@richburroughs) August 26, 2020

Upcoming CNCF webinars

If a webinar catches your interest, register and check it out. The ones below were listed as upcoming at the time of writing.

Member Webinar: Let’s Untangle The Service Mesh
Dominik Tornow, Principal Engineer @Cisco
Sept 1, 2020 10:00 AM Pacific Time

Member Webinar: CNCF has 99+ K8S distros, and this is how (and why) we built one more: OKD4 on FCOS
Christian Glombek, Vadim Rutkovsky, Charro Gruver and Dusty Mabe @Red Hat
Sept 2, 2020 7:00 AM Pacific Time

Member Webinar: Getting started with container runtime security using Falco
Loris Degioanni, CTO and Founder @Sysdig
Sept 2, 2020 1:00 PM Pacific Time

Member Webinar: Running the next generation of cloud-native applications using Open Application Model (OAM)
Ryan Zhang, Staff Software Engineer @Alibaba Cloud
Sept 3, 2020 10:00 AM Pacific Time

Member Webinar: Arm Developer Experience Spanning Cloud, 5G and IoT
Darragh Grealish, Co-Founder @56K.Cloud
Marc Meunier, Sr. Manager, SW Ecosystem Development @Arm
Sept 8, 2020 10:00 AM Pacific Time

Member Webinar: Building a Cloud-Native Technology Stack that Supports Full Cycle Development
Daniel Bryant, Product Architect @Datawire
Sept 9, 2020 7:00 AM Pacific Time

Member Webinar: Highly scalable SaaS Apps on Kubernetes: Real Life Case Studies
Ram Kailasanathan, Senior Director Product Management @Oracle
Sept 9, 2020 1:00 PM Pacific Time

Member Webinar: Kubernetes and Networks: why is this so dang hard?
Tim Hockin, Principal Software Engineer @Google
Sept 10, 2020 10:00 AM Pacific Time

Member Webinar: Achieving Least Privilege Access in Kubernetes
Eran Leib Co-Founder and VP Product Management @Apolicy
Gregg Ogden Senior Product Marketing Manager @Aqua Security
Sept 11, 2020 10:00 AM Pacific Time

Ambassador Webinar: Hybrid Serverless Development using Quarkus and Kubernetes
Daniel Oh, Principal Technical Marketing Manager @RedHat and CNCF Ambassador
Sept 11, 2020 1:00 PM Pacific Time

Member Webinar: ChubaoFS Best Practices
Wei Ding, Staff Engineer @JD.com
Sept 15, 2020 10:00 AM Pacific Time

Member Webinar: How To Run Kubernetes Securely and Efficiently
Joe Pelletier, VP, Products Fairwinds @Fairwinds
Robert Brennan, Director, Open Source @Fairwinds
Sept 16, 2020 7:00 AM Pacific Time

Member Webinar: Effective Kubernetes Onboarding
Kathleen Juell, Developer, DODX @DigitalOcean
Sept 16, 2020 1:00 PM Pacific Time

How was it? Did any of the articles or information catch your interest?

There is still a lot here that I have not fully digested myself, so I intend to deepen my understanding while making use of these notes and links.

See you next time.

Bye now!!

Yoshiki Fujiwara