Manual Chapter: Prepare for Logging Node Upgrade with Minimal Downtime

Applies To:


BIG-IQ Centralized Management

  • 5.4.0

Preparing the logging node cluster for upgrade with minimal downtime

If you choose the minimal downtime method, you can upgrade your logging node cluster while minimizing the time that the cluster is offline.

Important: You cannot perform the minimal downtime upgrade unless you have at least 3 logging nodes in your cluster.

Check Logging Node health

You can use the Logging Configuration screen to review the overall health and status of the Logging Nodes you've configured. You can use the data displayed on this screen both before and after an upgrade to verify that your Logging Node cluster configuration is as you expect it to be.
Note: Perform this check on the BIG-IQ®, not on the Logging Node.
  1. At the top of the screen, click System Management.
  2. At the top of the screen, click Inventory.
  3. On the left, expand BIG-IQ LOGGING and then select Logging Configuration.
    The Logging Configuration screen opens to display the current state of the logging node cluster defined for this device.
  4. Record these Logging Node cluster details, as listed in the Summary area:
    • Logging Nodes in Cluster
    • Nodes in Cluster
    This information provides a detailed overview of the Logging Node cluster you have created to store alert or event data. After you complete an upgrade, you can check the health again and use this information to verify that the cluster was restored successfully.
  5. If there are any cluster health issues, resolve them and then repeat the process until the cluster health is as expected. A command-line cross-check of the cluster is sketched after this procedure.
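
You can also cross-check the cluster from the command line. The following is a minimal sketch; it assumes you are logged in as root on a device in the cluster and that the Elasticsearch service backing the logging node cluster listens on localhost port 9200, as in the disk space check later in this chapter.

    # Query the cluster health API on the local search service.
    # Look for a "green" status and a node count that matches the
    # Logging Nodes in Cluster / Nodes in Cluster values you recorded.
    curl -s localhost:9200/_cluster/health?pretty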

Confirm disk space is sufficient for upgrade with minimal downtime

As part of preparing to upgrade your logging node with minimal downtime, you must confirm that there is sufficient disk space in the cluster so that, when you take a logging node offline, there is room for its data on the other devices in the cluster. If the amount of free space in the cluster is less than the amount of data on any one node, there is insufficient space to upgrade without downtime. If this is the case, you need to either add logging nodes or increase storage space on the existing logging nodes.
  1. Use SSH to log in to a device in the cluster.
    You must log in as root to perform this procedure.
  2. Determine the storage space requirement for your logging node cluster using the following command:
    curl localhost:9200/_cat/allocation?v
    shards disk.indices disk.used disk.avail disk.total disk.percent host ip node 
    57 397.5mb 2gb 7.8gb 9.8gb 20 10.10.10.5 10.10.10.5 8637c04c-1b83-4795-b1f0-347ac733fd10 
    56 471.7mb 2.2gb 7.5gb 9.8gb 23 10.10.10.3 10.10.10.3 9d718ba7-5bb9-4866-9aa3-4677a1f60e46 
    56 393mb 2.1gb 7.7gb 9.8gb 21 10.10.10.2 10.10.10.2 8c4e58b4-a005-404f-9a53-6e318ec0e381 
    57 444.2mb 2gb 7.8gb 9.8gb 20 10.10.10.10 10.10.10.10 11ac40f9-5b13-4f9a-a739-0351858ba571
  3. Analyze the storage space requirement for your logging node to determine if there is sufficient disk space.
    In the previous example, there is plenty of space. The logging node consuming the most data is only consuming 2.2 GB, and each of the other logging nodes has almost 8 GB free. So when that logging node goes offline to upgrade, the system can move the 2.2 GB of data to the remaining 15.5 GB of free space. If these numbers were reversed, so that the logging node consuming the most storage had 7.8 GB of data, and the remaining logging nodes only had 6.3 GB free, there would be insufficient space to move the data when that logging node went offline.
If there is sufficient space, you can proceed. Otherwise, you need to either add logging nodes or add logging node storage space. A scripted version of this comparison is sketched after this procedure.
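
If you want to script the comparison, the following is a minimal sketch based on the same _cat/allocation output shown above. It assumes that the allocation API accepts the bytes=mb query parameter (so values are returned as plain megabyte numbers), that the columns appear in the order shown in the example, and that every row is a data-bearing logging node; discount any row that does not store event data.

    # Compare the largest per-node data footprint against the free space that
    # would remain on the other nodes if that node were taken offline.
    # Columns (as in the example above): shards disk.indices disk.used disk.avail ...
    curl -s 'localhost:9200/_cat/allocation?bytes=mb' | awk '
    {
        used  = $3 + 0                        # disk.used for this node (MB)
        avail = $4 + 0                        # disk.avail for this node (MB)
        total_avail += avail
        if (used > max_used) { max_used = used; max_avail = avail }
    }
    END {
        remaining = total_avail - max_avail   # free space left on the other nodes
        printf "largest node holds %d MB; other nodes have %d MB free\n", max_used, remaining
        if (remaining > max_used)
            print "sufficient space for a minimal downtime upgrade"
        else
            print "insufficient space: add logging nodes or storage"
    }'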

Confirm that BIG-IP logging node configuration is correct

In preparing to upgrade your logging node with minimal downtime, you must confirm that the BIG-IP® device to logging node configuration is correct, so that when a particular logging node is upgraded, data that was being routed to it is automatically routed to another logging node in the cluster. There are two settings of particular concern:
  • Confirm that data sent from the BIG-IP devices is not being sent to just one logging node. Each BIG-IP device must be configured to send data to multiple logging nodes.
  • Confirm that the BIG-IP devices are configured with appropriate monitors that allow for traffic to switch to a different logging node when one logging node is taken offline.
  1. Analyze the data routing configuration for all of the BIG-IP devices that send data to your logging node cluster.
  2. If you find a BIG-IP device that is configured to send data to only one logging node, change that configuration before proceeding.
    Refer to the BIG-IP documentation on support.f5.com for details on how to configure the BIG-IP to logging node routing.
  3. Analyze the monitor configuration for all of the BIG-IP devices that send data to your logging node cluster. Make sure that each device is configured to send data to an alternate logging node if one logging node goes offline.
  4. If you find a BIG-IP device that is not configured with appropriate monitors, change that configuration before proceeding.
    Refer to the BIG-IP documentation on support.f5.com for details on how to configure BIG-IP monitors correctly. A quick command-line check of these settings is sketched after this procedure.
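
As a quick check from the BIG-IP command line, the following is a minimal sketch. It assumes that the BIG-IP sends its event data to the logging nodes through an LTM pool; the pool name logging_node_pool is hypothetical, and the authoritative configuration steps are in the BIG-IP documentation referenced above.

    # List the pool that carries event traffic to the logging nodes.
    # Confirm it has more than one member and that a health monitor is
    # assigned, so traffic fails over when a logging node goes offline.
    tmsh list ltm pool logging_node_pool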

Stop snapshot creation

Because a mixed software version environment occurs during the upgrade process, if snapshot schedules are configured for the cluster, stop creating snapshots before you begin the upgrade to prevent possible issues with your data.
  1. Use SSH to log in to the primary BIG-IQ system for this cluster.
    You must log in as root to perform this procedure.
  2. Retrieve the list of scheduled snapshots using the following command: restcurl cm/shared/esmgmt/es-snapshot-task | grep task-scheduler
    config # restcurl cm/shared/esmgmt/es-snapshot-task | grep task-scheduler 
    "link": "https://localhost/mgmt/shared/task-scheduler/scheduler/0fdf50ec-8a17-3da9-b717-c63637ccc68a"
    "link": "https://localhost/mgmt/shared/task-scheduler/scheduler/0af33352-2f33-32b3-85cb-1281bb88c249"
    "link": "https://localhost/mgmt/shared/task-scheduler/scheduler/2ad770a8-bdb0-3383-99a9-300846eb0972"
    
    In this example, there are three snapshot schedules.
  3. Stop each of the schedules using the following command: restcurl -X PATCH -d '{"status":"DISABLED"}' shared/task-scheduler/scheduler/<SNAPSHOT ID>
    # restcurl -X PATCH -d '{"status":"DISABLED"}' shared/task-scheduler/scheduler/0af33352-2f33-32b3-85cb-1281bb88c249
    { "id": "0af33352-2f33-32b3-85cb-1281bb88c249", "status":"DISABLED", ...}
    
    After you run the command for each scheduled snapshot, no more snapshots are created. If you have several schedules, you can also disable them all in one pass, as sketched after this procedure.
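
The following is a minimal sketch that extracts each scheduler ID from the snapshot task query and disables it in a single loop. It assumes the restcurl output has the format shown in the example above.

    # Disable every scheduled snapshot returned by the snapshot task query.
    for id in $(restcurl cm/shared/esmgmt/es-snapshot-task | grep task-scheduler \
                | sed 's|.*scheduler/||; s|[",]||g'); do
        restcurl -X PATCH -d '{"status":"DISABLED"}' shared/task-scheduler/scheduler/$id
    done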