SUSE CaaS Platform/Issues

Source: https://wiki.microfocus.com/index.php/SUSE_CaaS_Platform/Issues

This site is for documentation purposes only. If you run into any problems with your subscribed SUSE CaaS Platform, please contact SUSE support!

This site lists workarounds for issues which need to be fixed within SUSE CaaS Platform.

Outdated Certificates

This workaround is tested; however, if you have a problem, please open a service request! Fixing the bug yourself is at your own risk.

It sometimes might happen that a certificate expires and is not renewed properly. To fix this issue, perform the following steps:

1) SSH onto the Admin Node and move the expired certificates out of the way:

  mv /etc/pki/{velum,ldap,salt-api}.crt /root
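
Before moving anything, it can help to confirm which certificates have actually expired. A minimal check with openssl, assuming the certificates live under /etc/pki as above:

  for cert in /etc/pki/{velum,ldap,salt-api}.crt; do
      echo -n "$cert: "
      openssl x509 -in "$cert" -noout -enddate
  done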

2) Regenerate the set of certs moved in step 1:

  cd /etc/pki
  /usr/share/caasp-container-manifests/gen-certs.sh

If you want to regenerate additional certificates (e.g. /etc/pki/kubectl-client-cert.crt) that are not rebuilt by this script, append an additional line to the end of the script and run it again from a transactional-update shell:

 transactional-update shell
 transactional update # echo "gencert \"kubectl-client-cert\" \"kubectl-client-cert\" \"\$all_hostnames\" \"\$(ip_addresses)\"" >>/usr/share/caasp-container-manifests/gen-certs.sh
 transactional update # /usr/share/caasp-container-manifests/gen-certs.sh
 transactional update # exit 
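
After the script has run, you can verify that the regenerated certificates carry new expiry dates. A quick check (the file list here is only an example; adjust it to the certificates you regenerated):

  openssl x509 -in /etc/pki/velum.crt -noout -dates
  openssl x509 -in /etc/pki/salt-api.crt -noout -dates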


3) On a master node, backup and delete the dex-tls secret:

  kubectl -n kube-system get secret dex-tls -o yaml > /root/dex-tls
  kubectl -n kube-system delete secret dex-tls
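
If you want to be sure the backup was written before you rely on it, a simple sanity check (the /root/dex-tls path is the one used above):

  ls -l /root/dex-tls
  grep "kind: Secret" /root/dex-tls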

4) On a master node, find and delete the dex pods (bsc#1082996):

This *will* prevent new authentication requests from succeeding against the cluster. However, the static credentials located on the master nodes will continue to function.

  kubectl -n kube-system get pods | grep dex
  kubectl -n kube-system delete pods <Dex Pod 1> <Dex Pod 2> <Dex Pod 3>

They will *NOT* start back up by themselves until the dex-tls secret is recreated as part of step 5.
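
If you prefer not to copy the pod names by hand, the same grep can feed the delete command directly. A sketch using only the pod-name match shown above:

  kubectl -n kube-system delete pods $(kubectl -n kube-system get pods | grep dex | awk '{print $1}')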

5) Manually run the Salt orchestration on the admin node; this may take some time:

  docker exec -it $(docker ps | grep salt-master | awk '{print $1}') bash -c "salt-run state.orchestrate orch.kubernetes" > salt-run.log 2>&1
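
To follow the orchestration while it runs, you can tail the log from a second shell on the admin node:

  tail -f salt-run.log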

6) Check the tail of salt-run.log to see if the orchestration succeeded:

  tail -n 50 salt-run.log
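
Assuming the usual Salt highstate output format, failing states show up as "Result: False" and the summary contains "Failed:" counters; a hedged check across the whole log:

  grep -c "Result: False" salt-run.log
  grep "Failed:" salt-run.log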

7) On a master node, validate the dex pods are running:

  kubectl -n kube-system get pods | grep dex
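
To spot dex pods that are not yet in the Running state, invert the match; no output means all dex pods report Running:

  kubectl -n kube-system get pods | grep dex | grep -v Running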

8) In many cases, you won't be able to log in to Velum after the change. If this is the case, reboot the admin node, then test and validate that the cluster is still functional.
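
One way to validate the cluster after the reboot, run from a master node (a basic sketch, not an exhaustive check):

  kubectl get nodes
  kubectl cluster-info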

Cluster Scaling

Without changes to the configuration, the maximum number of nodes in a cluster is 40. The Salt Master configuration needs to be adjusted to handle installation and updates of larger clusters:

  • Above 40 nodes, salt worker_threads count must be increased to 20
  • Above 60 nodes, salt worker_threads count must be increased to 30
  • Above 75 nodes, salt worker_threads count must be increased to 40
  • Above 85 nodes, salt worker_threads count must be increased to 50
  • Above 95 nodes, salt worker_threads count must be increased to 60

To change the variable in the Salt Master configuration, run the following on the Administration Node:

 echo "worker_threads: 20" > /etc/caasp/salt-master-custom.conf
 saltid=$(docker ps | grep salt-master | awk '{print $1}')
 docker kill $saltid

Note: the Salt Master container will be restarted automatically by the kubelet.
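
To confirm the change was picked up, check the custom configuration file and that a fresh salt-master container is running again (a minimal check, using the paths and commands from above):

  cat /etc/caasp/salt-master-custom.conf
  docker ps | grep salt-master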

Following a bootstrapping failure, you can check whether the Salt worker_threads setting is too low:

  docker logs $(docker ps | grep salt-master | awk '{print $1}') 2>&1 | grep -i worker_threads