#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: openstack-helm 0.1.1.dev4021\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2023-10-27 22:03+0000\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

#: ../../source/testing/ceph-node-resiliency.rst:3
msgid "Ceph - Node Reduction, Expansion and Ceph Recovery"
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:5
msgid ""
"This document captures the steps and results from node reduction and "
"expansion as well as Ceph recovery."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:9
msgid "Test Scenarios:"
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:10
msgid ""
"1) Node reduction: Shut down 1 of 3 nodes to simulate node failure. Capture "
"the effect of the node failure on Ceph as well as on other OpenStack "
"services that are using Ceph."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:13
msgid ""
"2) Node expansion: Apply Ceph and OpenStack related labels to another, "
"unused K8s node. Node expansion should provide more resources for K8s to "
"schedule PODs for Ceph and OpenStack services."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:17
msgid ""
"3) Fix Ceph Cluster: After node expansion, perform maintenance on the Ceph "
"cluster to ensure quorum is reached and Ceph is HEALTH_OK."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:21
#: ../../source/testing/ceph-upgrade.rst:14
msgid "Setup:"
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:22
msgid "6 Nodes (VM based) env"
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:23
msgid ""
"Only 3 nodes will have Ceph and OpenStack related labels. Each of these 3 "
"nodes will have one MON and one OSD running on it."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:25
#: ../../source/testing/ceph-upgrade.rst:16
msgid ""
"Followed OSH multinode guide steps to set up nodes and install the K8s "
"cluster"
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:26
msgid ""
"Followed OSH multinode guide steps to install Ceph and OpenStack charts up "
"to Cinder."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:30
#: ../../source/testing/ceph-upgrade.rst:50
msgid "Steps:"
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:31
msgid ""
"1) Initial Ceph and OpenStack deployment: Install Ceph and OpenStack charts "
"on 3 nodes (mnode1, mnode2 and mnode3). Capture the Ceph cluster status as "
"well as the K8s PODs status."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:35
msgid ""
"2) Node reduction (failure): Shut down 1 of the 3 nodes (mnode3) to test "
"node failure. This should cause the Ceph cluster to go into the HEALTH_WARN "
"state as it has lost 1 MON and 1 OSD. Capture the Ceph cluster status as "
"well as the K8s PODs status."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:40
msgid ""
"3) Node expansion: Add Ceph and OpenStack related labels to a 4th node "
"(mnode4) for expansion. The Ceph cluster will show a new MON and OSD being "
"added to the cluster. However, the Ceph cluster will continue to show "
"HEALTH_WARN because 1 MON and 1 OSD are still missing."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:46
msgid ""
"4) Ceph cluster recovery: Perform Ceph maintenance to bring the Ceph "
"cluster back to HEALTH_OK. Remove the lost MON and OSD from the Ceph "
"cluster."
msgstr ""
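# The steps above repeatedly ask to capture the Ceph cluster status and the
# K8s PODs status. A minimal sketch of how that capture might look, assuming
# the charts were deployed into the "ceph" and "openstack" namespaces and the
# standard helm-toolkit labels (application=ceph, component=mon); ${MON_POD}
# is just a placeholder for any running ceph-mon pod:
#
#   MON_POD=$(kubectl get pods -n ceph -l application=ceph,component=mon \
#     -o jsonpath='{.items[0].metadata.name}')
#   kubectl exec -n ceph ${MON_POD} -- ceph -s        # overall cluster health
#   kubectl get pods -n ceph -o wide                  # Ceph PODs and their nodes
#   kubectl get pods -n openstack -o wide             # OpenStack PODs and their nodes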
#: ../../source/testing/ceph-node-resiliency.rst:52
msgid "Step 1: Initial Ceph and OpenStack deployment"
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:55
msgid ""
"Make sure only 3 nodes (mnode1, mnode2, mnode3) have Ceph and OpenStack "
"related labels. K8s will only schedule PODs on these 3 nodes."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:58
#: ../../source/testing/ceph-node-resiliency.rst:357
#: ../../source/testing/ceph-node-resiliency.rst:632
msgid "``Ceph status:``"
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:83
#: ../../source/testing/ceph-node-resiliency.rst:441
msgid "``Ceph MON Status:``"
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:187
#: ../../source/testing/ceph-node-resiliency.rst:386
#: ../../source/testing/ceph-node-resiliency.rst:768
msgid "``Ceph quorum status:``"
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:244
#: ../../source/testing/ceph-node-resiliency.rst:543
#: ../../source/testing/ceph-node-resiliency.rst:831
msgid "``Ceph PODs:``"
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:273
#: ../../source/testing/ceph-node-resiliency.rst:576
#: ../../source/testing/ceph-node-resiliency.rst:867
msgid "``OpenStack PODs:``"
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:300
#: ../../source/testing/ceph-node-resiliency.rst:611
#: ../../source/testing/ceph-node-resiliency.rst:902
msgid "``Result/Observation:``"
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:302
msgid "The Ceph cluster is in the HEALTH_OK state with 3 MONs and 3 OSDs."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:303
msgid "All PODs are in the Running state."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:307
msgid "Step 2: Node reduction (failure):"
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:309
msgid ""
"Shut down 1 of the 3 nodes (mnode1, mnode2, mnode3) to simulate node "
"failure/loss."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:311
msgid "In this test env, let's shut down the ``mnode3`` node."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:313
msgid "``Following are the PODs scheduled on mnode3 before shutdown:``"
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:336
msgid ""
"In this test env, the MariaDB chart is deployed with only 1 replica. In "
"order to test properly, the node with the MariaDB server POD (mnode2) "
"should not be shut down."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:340
msgid ""
"In this test env, each node has Ceph and OpenStack related PODs. Due to "
"this, shutting down a node will cause issues with Ceph as well as OpenStack "
"services. These POD-level failures are captured in the subsequent "
"screenshots."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:344
msgid "``Check node status:``"
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:613
msgid ""
"PODs that were scheduled on the mnode3 node have a status of "
"NodeLost/Unknown."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:614
msgid "Ceph status shows HEALTH_WARN as expected."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:615
msgid "Ceph status shows 1 Ceph MON and 1 Ceph OSD missing."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:616
msgid ""
"OpenStack PODs that were scheduled on mnode3 also show NodeLost/Unknown."
msgstr ""
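# A minimal sketch of the node/POD status checks described above, run from a
# node that still has kubectl access; mnode3 is the node that was shut down:
#
#   kubectl get nodes                                    # mnode3 should report NotReady
#   kubectl get pods -n ceph -o wide | grep mnode3       # Ceph PODs stuck on the lost node
#   kubectl get pods -n openstack -o wide | grep mnode3  # OpenStack PODs stuck on the lost node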
#: ../../source/testing/ceph-node-resiliency.rst:619
msgid "Step 3: Node Expansion"
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:621
msgid "Let's add more resources for K8s to schedule PODs on."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:623
msgid ""
"In this test env, let's use ``mnode4`` and apply the Ceph and OpenStack "
"related labels."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:627
msgid ""
"Since the node that was shut down earlier had both Ceph and OpenStack PODs, "
"mnode4 should get Ceph and OpenStack related labels as well."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:630
msgid "After applying the labels, let's check the status."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:656
#: ../../source/testing/ceph-node-resiliency.rst:959
msgid "``Ceph MON Status``"
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:904
msgid "Ceph MON and OSD PODs got scheduled on the mnode4 node."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:905
msgid "Ceph status shows that the MON and OSD counts have increased."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:906
msgid ""
"Ceph status still shows HEALTH_WARN as one MON and one OSD are still down."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:909
msgid "Step 4: Ceph cluster recovery"
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:911
msgid ""
"Now that we have added a new node for Ceph and OpenStack PODs, let's "
"perform maintenance on the Ceph cluster."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:915
msgid "1) Remove the out-of-quorum MON:"
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:917
msgid ""
"Using the ``ceph mon_status`` and ``ceph -s`` commands, confirm the ID of "
"the MON that is out of quorum."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:919
msgid "In this test env, ``mnode3`` is out of quorum."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:922
msgid ""
"In this test env, since the out-of-quorum MON is no longer available due to "
"the node failure, we can proceed with removing it from the Ceph cluster."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:925
msgid "``Remove MON from Ceph cluster``"
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:932
msgid "``Ceph Status:``"
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:957
msgid ""
"As shown above, Ceph status is now HEALTH_OK and shows 3 MONs available."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:1043
msgid "``Ceph quorum status``"
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:1102
msgid "2) Remove the down OSD from the Ceph cluster:"
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:1104
msgid ""
"As shown in the Ceph status above (``osd: 4 osds: 3 up, 3 in``), 1 of the 4 "
"OSDs is still down. Let's remove that OSD."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:1107
msgid "First, run the ``ceph osd tree`` command to get a list of OSDs."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:1123
msgid "The above output shows that ``osd.1`` is down."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:1125
msgid ""
"Run the ``ceph osd purge`` command to remove the OSD from the Ceph cluster."
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:1132
msgid "``Ceph status``"
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:1157
msgid ""
"The above output shows the Ceph cluster in HEALTH_OK with all OSDs and MONs "
"up and running."
msgstr ""
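# A minimal sketch of the Step 3 and Step 4 actions described above. The label
# keys are the usual OpenStack-Helm node labels (adjust to the roles you
# actually deploy), the MON name (mnode3) and OSD ID (1) come from this test
# env, and ${MON_POD} is a placeholder for any running ceph-mon pod:
#
#   # Step 3: label the spare node so K8s can schedule Ceph and OpenStack PODs on it
#   kubectl label node mnode4 ceph-mon=enabled ceph-osd=enabled ceph-mgr=enabled
#   kubectl label node mnode4 openstack-control-plane=enabled openstack-compute-node=enabled
#
#   # Step 4: remove the lost MON and purge the down OSD
#   kubectl exec -n ceph ${MON_POD} -- ceph mon remove mnode3
#   kubectl exec -n ceph ${MON_POD} -- ceph osd purge 1 --yes-i-really-mean-it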
#: ../../source/testing/ceph-node-resiliency.rst:1159
msgid "``Ceph PODs``"
msgstr ""

#: ../../source/testing/ceph-node-resiliency.rst:1195
msgid "``OpenStack PODs``"
msgstr ""

#: ../../source/testing/ceph-resiliency/README.rst:3
msgid "Resiliency Tests for OpenStack-Helm/Ceph"
msgstr ""

#: ../../source/testing/ceph-resiliency/README.rst:6
msgid "Mission"
msgstr ""

#: ../../source/testing/ceph-resiliency/README.rst:8
msgid ""
"The goal of our resiliency tests for `OpenStack-Helm/Ceph `_ is to show "
"symptoms of software/hardware failure and provide the solutions."
msgstr ""

#: ../../source/testing/ceph-resiliency/README.rst:13
msgid ""
"Our focus lies on resiliency for various failure scenarios but not on "
"performance or stress testing."
msgstr ""

#: ../../source/testing/ceph-resiliency/README.rst:14
msgid "Caveats:"
msgstr ""

#: ../../source/testing/ceph-resiliency/README.rst:17
msgid "Software Failure"
msgstr ""

#: ../../source/testing/ceph-resiliency/README.rst:18
msgid "`Monitor failure <./monitor-failure.html>`_"
msgstr ""

#: ../../source/testing/ceph-resiliency/README.rst:19
msgid "`OSD failure <./osd-failure.html>`_"
msgstr ""

#: ../../source/testing/ceph-resiliency/README.rst:22
msgid "Hardware Failure"
msgstr ""

#: ../../source/testing/ceph-resiliency/README.rst:23
msgid "`Disk failure <./disk-failure.html>`_"
msgstr ""

#: ../../source/testing/ceph-resiliency/README.rst:24
msgid "`Host failure <./host-failure.html>`_"
msgstr ""

#: ../../source/testing/ceph-resiliency/disk-failure.rst:3
msgid "Disk Failure"
msgstr ""

#: ../../source/testing/ceph-resiliency/disk-failure.rst:6
#: ../../source/testing/ceph-resiliency/host-failure.rst:6
#: ../../source/testing/ceph-resiliency/monitor-failure.rst:6
#: ../../source/testing/ceph-resiliency/osd-failure.rst:6
msgid "Test Environment"
msgstr ""

#: ../../source/testing/ceph-resiliency/disk-failure.rst:8
#: ../../source/testing/ceph-resiliency/host-failure.rst:8
#: ../../source/testing/ceph-resiliency/monitor-failure.rst:8
#: ../../source/testing/ceph-resiliency/osd-failure.rst:8
msgid "Cluster size: 4 host machines"
msgstr ""

#: ../../source/testing/ceph-resiliency/disk-failure.rst:9
#: ../../source/testing/ceph-resiliency/host-failure.rst:9
#: ../../source/testing/ceph-resiliency/monitor-failure.rst:9
#: ../../source/testing/ceph-resiliency/osd-failure.rst:9
msgid "Number of disks: 24 (= 6 disks per host * 4 hosts)"
msgstr ""

#: ../../source/testing/ceph-resiliency/disk-failure.rst:10
#: ../../source/testing/ceph-resiliency/host-failure.rst:10
msgid "Kubernetes version: 1.10.5"
msgstr ""

#: ../../source/testing/ceph-resiliency/disk-failure.rst:11
#: ../../source/testing/ceph-resiliency/host-failure.rst:11
#: ../../source/testing/ceph-resiliency/monitor-failure.rst:11
#: ../../source/testing/ceph-resiliency/osd-failure.rst:11
msgid "Ceph version: 12.2.3"
msgstr ""

#: ../../source/testing/ceph-resiliency/disk-failure.rst:12
#: ../../source/testing/ceph-resiliency/host-failure.rst:12
msgid "OpenStack-Helm commit: 25e50a34c66d5db7604746f4d2e12acbdd6c1459"
msgstr ""

#: ../../source/testing/ceph-resiliency/disk-failure.rst:15
msgid "Case: A disk fails"
msgstr ""

#: ../../source/testing/ceph-resiliency/disk-failure.rst:18
#: ../../source/testing/ceph-resiliency/host-failure.rst:18
#: ../../source/testing/ceph-resiliency/host-failure.rst:106
#: ../../source/testing/ceph-resiliency/monitor-failure.rst:134
msgid "Symptom:"
msgstr ""
#: ../../source/testing/ceph-resiliency/disk-failure.rst:20
msgid ""
"This is to test a scenario when a disk failure happens. We monitor the Ceph "
"status and notice that one OSD (osd.2) on voyager4, which has ``/dev/sdh`` "
"as its backend, is down."
msgstr ""

#: ../../source/testing/ceph-resiliency/disk-failure.rst:82
msgid "Solution:"
msgstr ""

#: ../../source/testing/ceph-resiliency/disk-failure.rst:84
msgid "To replace the failed OSD, execute the following procedure:"
msgstr ""

#: ../../source/testing/ceph-resiliency/disk-failure.rst:86
msgid ""
"From the Kubernetes cluster, remove the failed OSD pod, which is running on "
"``voyager4``:"
msgstr ""

#: ../../source/testing/ceph-resiliency/disk-failure.rst:94
msgid ""
"Note: To find the daemonset associated with a failed OSD, check the "
"following:"
msgstr ""

#: ../../source/testing/ceph-resiliency/disk-failure.rst:103
msgid ""
"Remove the failed OSD (OSD ID = 2 in this example) from the Ceph cluster:"
msgstr ""

#: ../../source/testing/ceph-resiliency/disk-failure.rst:112
msgid "Find that Ceph is healthy with a lost OSD (i.e., a total of 23 OSDs):"
msgstr ""

#: ../../source/testing/ceph-resiliency/disk-failure.rst:136
msgid ""
"4. Replace the failed disk with a new one. If you repair (not replace) the "
"failed disk, you may need to run the following:"
msgstr ""

#: ../../source/testing/ceph-resiliency/disk-failure.rst:143
msgid "Start a new OSD pod on ``voyager4``:"
msgstr ""

#: ../../source/testing/ceph-resiliency/disk-failure.rst:149
msgid ""
"Validate the Ceph status (i.e., one OSD is added, so the total number of "
"OSDs becomes 24):"
msgstr ""

#: ../../source/testing/ceph-resiliency/host-failure.rst:3
msgid "Host Failure"
msgstr ""

#: ../../source/testing/ceph-resiliency/host-failure.rst:15
msgid "Case: One host machine where ceph-mon is running is rebooted"
msgstr ""

#: ../../source/testing/ceph-resiliency/host-failure.rst:20
msgid "After reboot (node voyager3), the node status changes to ``NotReady``."
msgstr ""

#: ../../source/testing/ceph-resiliency/host-failure.rst:31
msgid ""
"Ceph status shows that the ceph-mon running on ``voyager3`` becomes out of "
"quorum. Also, the six osds running on ``voyager3`` are down; i.e., 18 osds "
"are up out of 24 osds."
msgstr ""

#: ../../source/testing/ceph-resiliency/host-failure.rst:64
#: ../../source/testing/ceph-resiliency/host-failure.rst:195
#: ../../source/testing/ceph-resiliency/monitor-failure.rst:179
msgid "Recovery:"
msgstr ""

#: ../../source/testing/ceph-resiliency/host-failure.rst:65
msgid ""
"The node status of ``voyager3`` changes to ``Ready`` after the node is up "
"again. Also, Ceph pods are restarted automatically. Ceph status shows that "
"the monitor running on ``voyager3`` is now in quorum."
msgstr ""

#: ../../source/testing/ceph-resiliency/host-failure.rst:101
msgid "Case: A host machine where ceph-mon is running is down"
msgstr ""

#: ../../source/testing/ceph-resiliency/host-failure.rst:103
msgid ""
"This is for the case when a host machine (where ceph-mon is running) is "
"down."
msgstr ""

#: ../../source/testing/ceph-resiliency/host-failure.rst:108
msgid ""
"After the host is down (node voyager3), the node status changes to "
"``NotReady``."
msgstr ""

#: ../../source/testing/ceph-resiliency/host-failure.rst:119
msgid ""
"Ceph status shows that the ceph-mon running on ``voyager3`` becomes out of "
"quorum. Also, 6 osds running on ``voyager3`` are down (i.e., 18 out of 24 "
"osds are up). Some placement groups become degraded and undersized."
msgstr ""

#: ../../source/testing/ceph-resiliency/host-failure.rst:154
msgid "The pod status of ceph-mon and ceph-osd shows as ``NodeLost``."
msgstr ""
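# A minimal sketch of how the host-failure symptoms above might be observed,
# assuming kubectl access and a placeholder ${MON_POD} for any surviving
# ceph-mon pod:
#
#   kubectl get nodes                                    # voyager3 reports NotReady
#   kubectl get pods -n ceph -o wide | grep voyager3     # ceph-mon/ceph-osd pods show NodeLost
#   kubectl exec -n ceph ${MON_POD} -- ceph -s           # 1 mon out of quorum, 18/24 osds up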
#: ../../source/testing/ceph-resiliency/host-failure.rst:168
msgid ""
"After 10+ minutes, Ceph starts rebalancing with one node lost (i.e., 6 osds "
"down) and the status stabilizes with 18 osds."
msgstr ""

#: ../../source/testing/ceph-resiliency/host-failure.rst:197
msgid ""
"The node status of ``voyager3`` changes to ``Ready`` after the node is up "
"again. Also, Ceph pods are restarted automatically. The Ceph status shows "
"that the monitor running on ``voyager3`` is now in quorum and the 6 osds "
"get back up (i.e., a total of 24 osds are up)."
msgstr ""

#: ../../source/testing/ceph-resiliency/host-failure.rst:224
msgid ""
"Also, the pod status of ceph-mon and ceph-osd changes from ``NodeLost`` "
"back to ``Running``."
msgstr ""

#: ../../source/testing/ceph-resiliency/index.rst:3
msgid "Ceph Resiliency"
msgstr ""

#: ../../source/testing/ceph-resiliency/monitor-failure.rst:3
msgid "Monitor Failure"
msgstr ""

#: ../../source/testing/ceph-resiliency/monitor-failure.rst:10
#: ../../source/testing/ceph-resiliency/osd-failure.rst:10
msgid "Kubernetes version: 1.9.3"
msgstr ""

#: ../../source/testing/ceph-resiliency/monitor-failure.rst:12
#: ../../source/testing/ceph-resiliency/osd-failure.rst:12
msgid "OpenStack-Helm commit: 28734352741bae228a4ea4f40bcacc33764221eb"
msgstr ""

#: ../../source/testing/ceph-resiliency/monitor-failure.rst:14
msgid ""
"We have 3 Monitors in this Ceph cluster, one on each of the 3 Monitor hosts."
msgstr ""

#: ../../source/testing/ceph-resiliency/monitor-failure.rst:18
msgid "Case: 1 out of 3 Monitor Processes is Down"
msgstr ""

#: ../../source/testing/ceph-resiliency/monitor-failure.rst:20
msgid "This is to test a scenario when 1 out of 3 Monitor processes is down."
msgstr ""

#: ../../source/testing/ceph-resiliency/monitor-failure.rst:22
msgid ""
"To bring down 1 Monitor process (out of 3), we identify a Monitor process "
"and kill it from the monitor host (not a pod)."
msgstr ""

#: ../../source/testing/ceph-resiliency/monitor-failure.rst:31
msgid ""
"In the meantime, we monitored the status of Ceph and noted that it takes "
"about 24 seconds for the killed Monitor process to recover from ``down`` to "
"``up``. The reason is that Kubernetes automatically restarts pods whenever "
"they are killed."
msgstr ""

#: ../../source/testing/ceph-resiliency/monitor-failure.rst:64
msgid ""
"We also monitored the status of the Monitor pod through ``kubectl get pods "
"-n ceph``, and the status of the pod (where a Monitor process is killed) "
"changed as follows: ``Running`` -> ``Error`` -> ``Running``, and this "
"recovery process takes about 24 seconds."
msgstr ""

#: ../../source/testing/ceph-resiliency/monitor-failure.rst:70
msgid "Case: 2 out of 3 Monitor Processes are Down"
msgstr ""

#: ../../source/testing/ceph-resiliency/monitor-failure.rst:72
msgid ""
"This is to test a scenario when 2 out of 3 Monitor processes are down. To "
"bring down 2 Monitor processes (out of 3), we identify two Monitor "
"processes and kill them from the 2 monitor hosts (not the pods)."
msgstr ""

#: ../../source/testing/ceph-resiliency/monitor-failure.rst:76
msgid ""
"We monitored the status of Ceph when the Monitor processes are killed and "
"noted that the symptoms are similar to when 1 Monitor process is killed:"
msgstr ""

#: ../../source/testing/ceph-resiliency/monitor-failure.rst:80
msgid ""
"It takes longer (about 1 minute) for the killed Monitor processes to "
"recover from ``down`` to ``up``."
msgstr ""

#: ../../source/testing/ceph-resiliency/monitor-failure.rst:83
msgid ""
"The status of the pods (where the two Monitor processes are killed) changed "
"as follows: ``Running`` -> ``Error`` -> ``CrashLoopBackOff`` -> "
"``Running``, and this recovery process takes about 1 minute."
msgstr ""
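# A minimal sketch of how a Monitor process might be identified and killed on
# one of the monitor hosts, as done in the cases above (the PID is a
# placeholder; this is run on the host itself, not inside a pod):
#
#   ps -ef | grep ceph-mon               # find the ceph-mon process on this host
#   sudo kill <ceph-mon-pid>             # kill it; Kubernetes restarts the pod
#   kubectl get pods -n ceph -w          # watch the mon pod go Error -> Running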
#: ../../source/testing/ceph-resiliency/monitor-failure.rst:89
msgid "Case: 3 out of 3 Monitor Processes are Down"
msgstr ""

#: ../../source/testing/ceph-resiliency/monitor-failure.rst:91
msgid ""
"This is to test a scenario when 3 out of 3 Monitor processes are down. To "
"bring down 3 Monitor processes (out of 3), we identify all 3 Monitor "
"processes and kill them from the 3 monitor hosts (not pods)."
msgstr ""

#: ../../source/testing/ceph-resiliency/monitor-failure.rst:95
msgid ""
"We monitored the status of the Ceph Monitor pods and noted that the "
"symptoms are similar to when 1 or 2 Monitor processes are killed:"
msgstr ""

#: ../../source/testing/ceph-resiliency/monitor-failure.rst:123
msgid ""
"The status of the pods (where the three Monitor processes are killed) "
"changed as follows: ``Running`` -> ``Error`` -> ``CrashLoopBackOff`` -> "
"``Running``, and this recovery process takes about 1 minute."
msgstr ""

#: ../../source/testing/ceph-resiliency/monitor-failure.rst:128
msgid "Case: Monitor database is destroyed"
msgstr ""

#: ../../source/testing/ceph-resiliency/monitor-failure.rst:130
msgid ""
"We intentionally destroy a Monitor database by removing ``/var/lib/"
"openstack-helm/ceph/mon/mon/ceph-voyager3/store.db``."
msgstr ""

#: ../../source/testing/ceph-resiliency/monitor-failure.rst:136
msgid ""
"A Ceph Monitor running on voyager3 (whose Monitor database is destroyed) "
"becomes out of quorum, and the mon-pod's status cycles through ``Running`` "
"-> ``Error`` -> ``CrashLoopBackOff`` while it keeps restarting."
msgstr ""

#: ../../source/testing/ceph-resiliency/monitor-failure.rst:169
msgid ""
"The logs of the failed mon-pod show that the ceph-mon process cannot run "
"because ``/var/lib/ceph/mon/ceph-voyager3/store.db`` does not exist."
msgstr ""

#: ../../source/testing/ceph-resiliency/monitor-failure.rst:181
msgid ""
"Remove the entire ceph-mon directory on voyager3, and then Ceph will "
"automatically recreate the database by using the other ceph-mons' "
"databases."
msgstr ""
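# A minimal sketch of the recovery described above. The directory is assumed
# to be the parent of the store.db path from the symptom text; deleting the
# crash-looping mon pod afterwards is an extra, assumed step so that it
# re-initializes immediately rather than waiting for the next restart:
#
#   # on voyager3 itself (the host, not a pod)
#   sudo rm -rf /var/lib/openstack-helm/ceph/mon/mon/ceph-voyager3
#
#   # back on the Kubernetes side, restart the affected mon pod
#   kubectl delete pod -n ceph <ceph-mon-pod-on-voyager3>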
#: ../../source/testing/ceph-resiliency/osd-failure.rst:3
msgid "OSD Failure"
msgstr ""

#: ../../source/testing/ceph-resiliency/osd-failure.rst:15
msgid "Case: OSD processes are killed"
msgstr ""

#: ../../source/testing/ceph-resiliency/osd-failure.rst:17
msgid "This is to test a scenario when some of the OSDs are down."
msgstr ""

#: ../../source/testing/ceph-resiliency/osd-failure.rst:19
msgid ""
"To bring down 6 OSDs (out of 24), we identify the OSD processes and kill "
"them from a storage host (not a pod)."
msgstr ""

#: ../../source/testing/ceph-resiliency/osd-failure.rst:52
msgid ""
"In the meantime, we monitored the status of Ceph and noted that it takes "
"about 30 seconds for the 6 OSDs to recover from ``down`` to ``up``. The "
"reason is that Kubernetes automatically restarts OSD pods whenever they are "
"killed."
msgstr ""

#: ../../source/testing/ceph-resiliency/osd-failure.rst:69
msgid "Case: An OSD pod is deleted"
msgstr ""

#: ../../source/testing/ceph-resiliency/osd-failure.rst:71
msgid ""
"This is to test a scenario when an OSD pod is deleted by ``kubectl delete "
"$OSD_POD_NAME``. Meanwhile, we monitor the status of Ceph and note that it "
"takes about 90 seconds for the OSD running in the deleted pod to recover "
"from ``down`` to ``up``."
msgstr ""

#: ../../source/testing/ceph-resiliency/osd-failure.rst:102
msgid ""
"We also monitored the pod status through ``kubectl get pods -n ceph`` "
"during this process. The deleted OSD pod status changed as follows: "
"``Terminating`` -> ``Init:1/3`` -> ``Init:2/3`` -> ``Init:3/3`` -> "
"``Running``, and this process takes about 90 seconds. The reason is that "
"Kubernetes automatically restarts OSD pods whenever they are deleted."
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:3
msgid "Ceph Upgrade"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:5
msgid ""
"This guide documents the steps for a Ceph version upgrade. The main goal of "
"this document is to demonstrate a Ceph chart update without downtime for "
"OSH components."
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:9
msgid "Test Scenario:"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:10
msgid ""
"Upgrade the Ceph component version from ``12.2.4`` to ``12.2.5`` without "
"downtime to OSH components."
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:15
msgid "3 Node (VM based) env."
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:17
msgid "Followed OSH multinode guide steps up to Ceph install"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:20
msgid "Plan:"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:21
msgid "Install Ceph charts (12.2.4) by updating Docker images in overrides."
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:22
msgid "Install OSH components as per the OSH multinode guide."
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:23
msgid ""
"Upgrade Ceph charts to version 12.2.5 by updating Docker images in "
"overrides."
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:27
msgid "Docker Images:"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:28
msgid "Ceph Luminous point release images for Ceph components"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:35
msgid "Ceph RBD provisioner Docker images."
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:42
msgid "Ceph Cephfs provisioner Docker images."
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:53
msgid "Follow all steps from the OSH multinode guide with the changes below."
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:55
msgid "Install Ceph charts (version 12.2.4)"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:58
msgid ""
"Update the Ceph install script ``./tools/deployment/multinode/030-ceph.sh`` "
"to add an ``images:`` section to the overrides as shown below."
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:62
msgid "The OSD count is set to 3 based on the env setup."
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:65
msgid "Following is a partial extract from the script to show the changes."
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:104
msgid ""
"``ceph_bootstrap``, ``ceph-config_helper`` and ``ceph_rbs_pool`` images are "
"used for jobs. ``ceph_mon_check`` has one script that is stable, so there "
"is no need to upgrade it."
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:108
msgid "Deploy and Validate Ceph"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:130
msgid "Check Ceph Pods"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:168
msgid "Check the version of each Ceph component."
msgstr ""
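# A minimal sketch of checking the version of each Ceph component, assuming
# kubectl access and a placeholder ${MON_POD} for any running ceph-mon pod
# (``ceph versions`` is available from Luminous onwards):
#
#   MON_POD=$(kubectl get pods -n ceph -l application=ceph,component=mon \
#     -o jsonpath='{.items[0].metadata.name}')
#   kubectl exec -n ceph ${MON_POD} -- ceph versions    # mon/mgr/osd versions in the cluster
#   kubectl exec -n ceph ${MON_POD} -- ceph -v          # version of the ceph binary itself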
#: ../../source/testing/ceph-upgrade.rst:187
#: ../../source/testing/ceph-upgrade.rst:550
msgid "Check which images the Provisioner and Mon-Check PODs are using"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:190
msgid ""
"Showing partial output from the kubectl describe command to show which "
"image the Docker container is using"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:221
msgid "Install OpenStack charts"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:223
msgid ""
"Continue with the OSH multinode guide to install other OpenStack charts."
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:225
msgid "Capture Ceph pod statuses."
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:261
msgid "Capture OpenStack pod statuses."
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:315
msgid "Upgrade Ceph charts to the updated version"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:317
msgid ""
"Use the Ceph override file ``ceph.yaml`` that was generated previously and "
"update the images section as below"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:320
msgid "``cp /tmp/ceph.yaml ceph-update.yaml``"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:322
msgid ""
"Update the image section in the new overrides file ``ceph-update.yaml`` as "
"shown below"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:341
msgid "Update Ceph Mon chart with new overrides"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:344
msgid "``helm upgrade ceph-mon ./ceph-mon --values=ceph-update.yaml``"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:346
#: ../../source/testing/ceph-upgrade.rst:375
msgid "``series of console outputs:``"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:365
msgid ""
"``Results:`` Mon pods got updated one by one (rolling updates). Each Mon "
"pod got respawned and was in the 1/1 Running state before the next Mon pod "
"got updated. Each Mon pod got restarted. Other Ceph pods were not affected "
"by this update. No interruption to OSH pods."
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:371
msgid "Update Ceph OSD chart with new overrides:"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:373
msgid "``helm upgrade ceph-osd ./ceph-osd --values=ceph-update.yaml``"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:391
#: ../../source/testing/ceph-upgrade.rst:419
msgid ""
"``Results:`` Rolling updates (one pod at a time). Other Ceph pods are "
"running. No interruption to OSH pods."
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:395
msgid "Update Ceph Client chart with new overrides:"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:397
msgid "``helm upgrade ceph-client ./ceph-client --values=ceph-update.yaml``"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:422
msgid "Update Ceph Provisioners chart with new overrides:"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:424
msgid ""
"``helm upgrade ceph-provisioners ./ceph-provisioners --values=ceph-update."
"yaml``"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:441
msgid ""
"``Results:`` All provisioner pods got terminated at once (at the same "
"time). Other Ceph pods are running. No interruption to OSH pods."
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:444
msgid "Capture final Ceph pod statuses:"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:479
msgid "Capture final OpenStack pod statuses:"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:531
msgid "Confirm each Ceph component's version."
msgstr ""
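# A minimal sketch of confirming which image a pod is actually running after
# the upgrade; the provisioner pod name below is a placeholder, and the same
# check works for the mon-check pod:
#
#   kubectl describe pod -n ceph <ceph-rbd-provisioner-pod> | grep -i image:
#   # or, more compactly, list every pod in the namespace with its image(s):
#   kubectl get pods -n ceph \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].image}{"\n"}{end}'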
#: ../../source/testing/ceph-upgrade.rst:580
msgid "Conclusion:"
msgstr ""

#: ../../source/testing/ceph-upgrade.rst:581
msgid ""
"Ceph can be upgraded without downtime for OpenStack components in a "
"multinode env."
msgstr ""

#: ../../source/testing/helm-tests.rst:3
msgid "Helm Tests"
msgstr ""

#: ../../source/testing/helm-tests.rst:5
msgid ""
"Every OpenStack-Helm chart should include any required Helm tests necessary "
"to provide a sanity check for the OpenStack service. Information on using "
"the Helm testing framework can be found in the Helm repository_. Currently, "
"the Rally testing framework is used to provide these checks for the core "
"services. The Keystone Helm test template can be used as a reference, and "
"can be found here_."
msgstr ""

#: ../../source/testing/helm-tests.rst:17
msgid "Testing Expectations"
msgstr ""

#: ../../source/testing/helm-tests.rst:19
msgid ""
"Any templates for Helm tests submitted should follow the philosophies "
"applied in the other templates. These include: use of overrides where "
"appropriate, use of endpoint lookups and other common functionality in "
"helm-toolkit, and mounting any required scripting templates via the "
"configmap-bin template for the service chart. If Rally tests are not "
"appropriate or adequate for a service chart, any additional tests should be "
"documented appropriately and adhere to the same expectations."
msgstr ""

#: ../../source/testing/helm-tests.rst:28
msgid "Running Tests"
msgstr ""

#: ../../source/testing/helm-tests.rst:30
msgid "Any Helm tests associated with a chart can be run by executing:"
msgstr ""

#: ../../source/testing/helm-tests.rst:36
msgid ""
"The output of the Helm tests can be seen by looking at the logs of the pod "
"created by the Helm tests. These logs can be viewed with:"
msgstr ""

#: ../../source/testing/helm-tests.rst:43
msgid ""
"Additional information on Helm tests for OpenStack-Helm and how to execute "
"these tests locally via the scripts used in the gate can be found in the "
"gates_ directory."
msgstr ""

#: ../../source/testing/helm-tests.rst:51
msgid "Adding Tests"
msgstr ""

#: ../../source/testing/helm-tests.rst:53
msgid ""
"All tests should be added to the gates during development, and are required "
"for any new service charts prior to merging. All Helm tests should be "
"included as part of the deployment script. An example of this can be seen "
"in this script_."
msgstr ""

#: ../../source/testing/index.rst:3
msgid "Testing"
msgstr ""
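# A minimal sketch of running a chart's Helm tests and viewing their output.
# The release name (keystone) and namespace are placeholders, and the test
# pod name assumes the <release>-test naming used by the OSH charts; the
# actual commands live in the literal blocks of helm-tests.rst and are not
# extracted into this catalog:
#
#   helm test keystone                         # run the Helm tests for a deployed release
#   kubectl logs -n openstack keystone-test    # view the output of the test pod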