Carol provides tips for what to examine to help determine if your system’s been breached.
This discussion stems from a conversation I had with a friend who asked what information is available on IBM i to help an organization determine whether they had been breached. These tips are not meant to be a replacement for a formal breach investigation service. In fact, I highly recommend that you develop an incident response plan and consider putting an outside firm on retainer in the event you’re breached. The information I provide here needs to be a part of a larger plan but may be helpful in determining whether you need to call in outside resources for further analysis...or help you come to the conclusion that nothing happened.
What Does the Audit Journal Say?
As you read this discussion, you’ll find that what you’ve configured for auditing and how long you’re keeping your audit journal receivers are key to how completely you can perform an investigation. For example, the first thing you’ll need to do is establish a time frame for your investigation. If you’ve been told that your data has appeared on the dark web, the time frame may need to be vast…as in weeks or months. If that happened to you, and taking into consideration what you have in long-term storage, how far back could you investigate? If it’s only two weeks, that’s likely not long enough to perform this type of investigation. Contrast that with an investigation of activity that occurred while a firewall was down, leaving your network open to direct access for a few hours. Hopefully, you would be alerted to this situation quickly enough that the audit journal receivers are still on your system and you don’t have to pull any back from storage.
Once you’ve established the time frame and restored any audit journal receivers required to support that time frame, you can start looking around. The first thing you want to establish is whether there were any changes to what was being audited. I would look for changes prior to the time frame you’re investigating. For an investigation that spans several months, you may want to pull the SV (system value) audit journal entries, looking for changes to the QAUDCTL and QAUDLVL system values for at least a month prior. For an investigation spanning a few hours, it may be sufficient to look for changes 24–48 hours prior to the start of the time frame. Continue to look for changes throughout the time frame being investigated. Why look for changes to these two system values? To determine whether auditing was turned off for a period of time and/or certain actions weren’t being audited. In other words, you need to be able to confirm that you have a contiguous and full set of audit journal entries.
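As a rough illustration of that check, here is a minimal Python sketch. It assumes the SV entries have already been exported from the audit journal into simple records; the field names (`type`, `when`, `sysval`, `user`) and the sample data are hypothetical and are not the real SV entry layout. It only shows the idea of widening the search window by a lookback period and keeping changes to the two auditing system values.

```python
from datetime import datetime, timedelta

# Hypothetical records exported from the audit journal. The field names
# are illustrative assumptions, not the real SV entry layout.
sv_entries = [
    {"type": "SV", "when": datetime(2024, 2, 20, 23, 5), "sysval": "QAUDLVL", "user": "JSMITH"},
    {"type": "SV", "when": datetime(2024, 3, 10, 9, 0), "sysval": "QTIME", "user": "QSECOFR"},
    {"type": "SV", "when": datetime(2024, 1, 2, 8, 0), "sysval": "QAUDCTL", "user": "QSECOFR"},
    {"type": "SV", "when": datetime(2024, 3, 12, 1, 30), "sysval": "QAUDCTL", "user": "JSMITH"},
]

# The two system values that control what is audited.
AUDIT_SYSVALS = {"QAUDCTL", "QAUDLVL"}


def audit_setting_changes(entries, start, end, lookback):
    """Return SV entries that changed QAUDCTL or QAUDLVL between
    (start - lookback) and end, oldest first."""
    earliest = start - lookback
    hits = [
        e for e in entries
        if e["type"] == "SV"
        and e["sysval"] in AUDIT_SYSVALS
        and earliest <= e["when"] <= end
    ]
    return sorted(hits, key=lambda e: e["when"])


# Investigation covers March 2024; look back an extra 30 days.
changes = audit_setting_changes(
    sv_entries,
    start=datetime(2024, 3, 1),
    end=datetime(2024, 3, 31),
    lookback=timedelta(days=30),
)
for e in changes:
    print(e["when"], e["sysval"], "changed by", e["user"])
```

Any entry this returns is a prompt to ask who made the change and why, and whether the audit data after that point can still be treated as complete.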
If you have *JOBDTA or *JOBBAS specified in QAUDLVL, the first set of audit entries I’ll look at are the JS entries. *JOBDTA or the subset, *JOBBAS, logs the start, stop, release, and hold of every job on the system. The volume of entries this value generates on busy systems causes many organizations to leave this value out of QAUDLVL, but this information is invaluable when looking for inappropriate activity. With the information in the JS entry, you can determine if there were any connections originating from IP addresses that you don’t recognize (such as connections coming from external IP addresses when the only connections should be internal IP addresses) or processes started using profiles you don’t expect (such as FTP transfers running with a service account created for ODBC connections).
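The two checks described above (unrecognized source addresses and profiles used for unexpected work) could be sketched in Python as follows. This is only an illustration under stated assumptions: the JS entries are assumed to have been exported into simple records, and the field names, the internal network ranges, the `ODBCSVC` service account, and the job-name mapping are all hypothetical and would need to reflect your own environment.

```python
import ipaddress

# Assumption: these ranges stand in for "our internal networks".
INTERNAL_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
]

# Hypothetical mapping of service profiles to the job-name prefixes
# they are expected to run under.
EXPECTED_JOBS = {"ODBCSVC": ("QZDASOINIT",)}

# Hypothetical JS entries; field names are illustrative only.
js_entries = [
    {"profile": "ODBCSVC", "job": "QZDASOINIT", "ip": "10.1.4.20"},
    {"profile": "ODBCSVC", "job": "QTFTP00123", "ip": "10.1.4.20"},
    {"profile": "MJONES", "job": "QPADEV0004", "ip": "203.0.113.45"},
]


def is_internal(ip_text):
    """True if the address falls inside one of the internal networks."""
    addr = ipaddress.ip_address(ip_text)
    return any(addr in net for net in INTERNAL_NETWORKS)


def review_js(entries):
    """Return (reason, entry) pairs for entries worth a closer look."""
    findings = []
    for e in entries:
        if not is_internal(e["ip"]):
            findings.append(("external address", e))
        allowed = EXPECTED_JOBS.get(e["profile"])
        if allowed is not None and not e["job"].startswith(allowed):
            findings.append(("unexpected job for profile", e))
    return findings


findings = review_js(js_entries)
for reason, e in findings:
    print(reason, "-", e["profile"], e["job"], e["ip"])
```

Profiles that are not listed in the mapping are checked for source address only, so the mapping can start small and grow as you learn what normal looks like.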
The next audit journal entries I look at are the PW (password) entries. The first set I examine are the PW subtype P entries. These are attempts to authenticate with an incorrect password. Note that this is not just users attempting to sign on to a green-screen. Attempts to log into any interface (FTP, ODBC, SSH, etc.) as well as attempts to sign on to the traditional signon display are logged. I look at these entries to determine if someone is trying to access the system using the well-known QSECOFR, QPGMR, QUSER, etc. system-provided user profiles. I also check for a more systematic attack, looking for several attempts to find a valid user ID/password combination but stopping short of disabling the profile. (If the attacker had found a valid combination, they would have been logged in, which would have resulted in a JS entry.)
Finally, I look for PW U entries. These entries are attempts to sign on with an incorrect user profile name. While looking at these entries, you’ll get your share of people who have simply mistyped their profile name or typed their password into the User field, but you’ll also see if someone has tried to attack your system using an automated bot. You’ll likely see attempts to authenticate with the profile name ROOT or ADMIN (the equivalent of QSECOFR in UNIX/Linux and Windows, respectively) and attempts to authenticate via the FTP or SSH (Secure Shell) servers. You can also see if someone has systematically attempted to find valid user names.
Side note: If you have a SIEM, I think it’s important to send all PW entries to that device. Unless someone is targeting a specific IBM i partition, it’s likely that attacks will hit many systems across your organization. Detecting that this activity is occurring on multiple systems and types of operating systems may allow your organization to more quickly spot an attack.
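To make the review of PW entries more concrete, here is a minimal Python sketch. It assumes the PW entries have been exported into simple records; the field names (`subtype`, `name`, `ip`), the sample data, and the threshold of three distinct names per source are hypothetical choices for illustration. It separates attempts against well-known profile names from sources that try many different names, which is one possible sign of a systematic attack.

```python
from collections import defaultdict

# Well-known names mentioned above; extend to suit your environment.
WELL_KNOWN = {"QSECOFR", "QPGMR", "QUSER", "ROOT", "ADMIN"}

# Hypothetical PW entries: subtype P = incorrect password,
# subtype U = incorrect user profile name. Field names are illustrative.
pw_entries = [
    {"subtype": "P", "name": "QSECOFR", "ip": "198.51.100.7"},
    {"subtype": "U", "name": "ROOT", "ip": "198.51.100.7"},
    {"subtype": "U", "name": "ADMIN", "ip": "198.51.100.7"},
    {"subtype": "U", "name": "TEST", "ip": "198.51.100.7"},
    {"subtype": "U", "name": "MJONSE", "ip": "10.1.4.33"},
    {"subtype": "P", "name": "MJONES", "ip": "10.1.4.33"},
]


def summarize_pw(entries, min_names=3):
    """Return (entries aimed at well-known names,
    {source ip: sorted names tried} for sources trying many names)."""
    well_known_hits = [e for e in entries if e["name"].upper() in WELL_KNOWN]

    names_by_source = defaultdict(set)
    for e in entries:
        names_by_source[e["ip"]].add(e["name"].upper())

    systematic = {
        ip: sorted(names)
        for ip, names in names_by_source.items()
        if len(names) >= min_names
    }
    return well_known_hits, systematic


well_known_hits, systematic = summarize_pw(pw_entries)
print(len(well_known_hits), "attempts against well-known names")
for ip, names in systematic.items():
    print(ip, "tried", ", ".join(names))
```

In this sample, the internal address with one mistyped name and one wrong password is left alone, while the source cycling through several names is surfaced for review.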
Another feature that may help you detect a widespread event is called Intrusion Detection and is enabled by specifying *ATNEVT in the QAUDLVL system value. This feature detects attacks at the IP stack level. While you can be alerted to these entries via email or message, you can also send the corresponding IM audit entries to your SIEM to assist in detecting widespread attacks. More information on Intrusion Detection on IBM i can be found in the IBM i Information Center: https://www.ibm.com/support/knowledgecenter/ssw_ibm_i_74/rzaub/rzaubpdf.pdf?view=kc
What Do You Do If You Detect What You Feel Is Inappropriate Access?
If you see evidence that someone has gained access to your environment, the next step is to attempt to determine what actions were taken. If object auditing has been enabled on your objects and, in particular, on your database files, you can pull the ZC audit entries and correlate the dates, times, and profiles discovered previously with the updates to objects. Unfortunately, many organizations haven’t configured the object auditing to include the reads of objects, but if you have, you can examine the ZR entries as well. Another place to look for activity, especially if you have no object auditing enabled, is in the database journals. Again, most organizations don’t log the reads of records, so you may only have evidence of records that were updated or deleted during the timeframe you’re investigating.
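The correlation step described above might look like the following Python sketch. It assumes you already have a list of suspect sessions (profile plus time window) from the earlier JS and PW review, and that the ZC entries have been exported into simple records. All field names, profile names, and object names here are hypothetical.

```python
from datetime import datetime

# Hypothetical suspect sessions identified earlier in the investigation.
suspect_sessions = [
    {"profile": "ODBCSVC",
     "start": datetime(2024, 3, 12, 1, 0),
     "end": datetime(2024, 3, 12, 3, 0)},
]

# Hypothetical ZC (object change) entries; field names are illustrative.
zc_entries = [
    {"when": datetime(2024, 3, 12, 1, 42), "profile": "ODBCSVC", "object": "PRODLIB/CUSTMAST"},
    {"when": datetime(2024, 3, 12, 14, 5), "profile": "ODBCSVC", "object": "PRODLIB/ORDERS"},
    {"when": datetime(2024, 3, 12, 2, 10), "profile": "MJONES", "object": "PRODLIB/CUSTMAST"},
]


def changes_during_sessions(entries, sessions):
    """Return entries whose profile and timestamp match a suspect session."""
    matched = []
    for e in entries:
        for s in sessions:
            if e["profile"] == s["profile"] and s["start"] <= e["when"] <= s["end"]:
                matched.append(e)
                break  # one matching session is enough
    return matched


matched = changes_during_sessions(zc_entries, suspect_sessions)
for e in matched:
    print(e["when"], e["profile"], "changed", e["object"])
```

The same matching could be applied to ZR entries or database journal entries if you have them; only the source of the records changes.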
You may also want to look at the DO (deletion of objects), CO (creation of objects), and OM (object moves or renames) entries as well as the CP (creation of and changes to user profiles) entries. The challenge in looking at all of these audit entries will be to separate normal activity from abnormal.
Last but certainly not least, examine the logs of your exit point software. Unless the attacker is an insider, it’s unlikely that someone is going to save a database file directly from IBM i and walk away with the media. So how do you determine what information was read and removed from (or uploaded to) your system? This is where exit point software provides a valuable piece of the puzzle you’re assembling. For each exit point that has a program registered and for which logging is enabled, you can review the transactions that occurred during the time frame you’re investigating, narrowing down your search using the information previously gathered from the audit journal. Depending on what you’re logging, you should be able to see which files were downloaded or uploaded. The information provided in these logs is invaluable during this type of investigation because it fills in the gaps left by the audit journal. For example, as noted above, most organizations don’t audit reads of objects, so it’s the exit program log that will show when a file was downloaded, which protocol was used, the profile that authenticated, and the originating IP address. Even if object auditing is enabled, you’d see in the ZR entry that the file was read using ODBC, but you wouldn’t have the exact SQL used.
What Do You Have Available if You Don’t Have Auditing Enabled or Exit Point Software?
The history log (QHST) is available if you have nothing else, but that will only provide successful connections, and depending on the connection type, you may or may not get the IP address. While QHST lists attempts to use an incorrect password, it doesn’t log attempts using an incorrect user name. In other words, you may get a peek into inappropriate activity, but there’s no way to get the full picture by only analyzing the history log.
What If You Find an Active Connection?
If you find a connection to your system that you determine is not appropriate, your first reaction may be to terminate that connection immediately. But is that really the right thing to do? It’s one thing to find past activity; it’s another to discover it’s currently happening. If you find an active connection, your incident response team must make the call as to whether the connection is immediately terminated. They may want to try to determine the origin of the connection and/or work with law enforcement to discover who’s behind the attack. Immediately terminating the connection may ruin that opportunity. You absolutely cannot make this decision by yourself, nor should you make any configuration changes during the investigation that could tip off the intruder or jeopardize evidence preservation. This underscores the need to do “tabletop” exercises that walk through different attack scenarios so that the incident response team is aware of the types of actions that are appropriate for each scenario and you know exactly what steps to take should you find an active connection.
Summary
This discussion was not meant to be an exhaustive list of everything that can be examined on IBM i during an investigation. My goals with this article were to get you to start thinking about what information you would have available in your environment should you be called upon to do an investigation and to make sure you’ve got an incident response plan in place that includes IBM i. After reading this article and analyzing your current IBM i configuration, you may decide to enable more auditing, or send audit information to your SIEM, or purchase an exit point solution, simply to have more activity logged even if you don’t want to ever add rules to restrict access. If the configuration of your IBM i is lacking—in other words, you wouldn’t have sufficient information to do an investigation—I encourage you to act now since it’s (obviously) too late once an event occurs!
MC Press Online