System overload when there are too many OIDs #39

Open
danielmarschall opened this issue Dec 26, 2023 · 4 comments

Comments


danielmarschall commented Dec 26, 2023

Taken from the current TODO file:

Important things:
- OIDplus must be able to handle a large number of OIDs, e.g. 100,000 OIDs in a branch. It doesn't need to be perfect, but it must not crash the system.
  Currently, the following "hacks" keep the system from crashing when there are too many OIDs:
	plugins/viathinksoft/raPages/099_object_log/OIDplusPageRaObjectLog.class.php (shows at most 100 object log entries)
	plugins/frdl/publicPages/altids/OIDplusPagePublicAltIds.class.php (hardcoded to ignore children of 1.3.6.1.4.1.37476.1.2.3.1)
	plugins/viathinksoft/publicPages/000_objects/OIDplusPagePublicObjects.class.php (does not show ANY children if there are more than 1000; see the sketch after this list)
	includes/classes/OIDplusMenuUtils.class.php (does not show ANY children if there are more than 1000)
	plugins/viathinksoft/adminPages/920_nostalgia/export_with_data.php (hardcoded to ignore children of 1.3.6.1.4.1.37476.1.2.3.1)
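
For illustration, the two "more than 1000" guards amount to something like the following (a minimal sketch; the table, column and variable names are assumptions, not the actual plugin code):

// Hypothetical guard: count the children first and refuse to enumerate
// them if there are too many, so one huge branch cannot stall the tree.
$res = OIDplus::db()->query("SELECT COUNT(*) AS cnt FROM ###objects WHERE parent = ?", [$parent_id]);
$row = $res->fetch_array();
if ((int)$row['cnt'] > 1000) {
	return []; // render the node as a leaf instead of loading 100,000 children
}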

See also frdl/oidplus-plugin-alternate-id-tracking#17 for the Alt-ID plugins.

In addition, it turns out that the backup plugin cannot be used. It was always slow, but now it is so slow that it throws an HTTP 500...


danielmarschall commented Dec 27, 2023

Backup plugin

The greatest bottleneck is the log, since I have 137,000 entries, and so the backup runs 2 × 137,000 queries (fetching the users and objects for each entry).
Instead, I should do a JOIN and then group manually with PHP, something like:

SELECT    log.id, lobj.object, lusr.username
FROM      `oidplus_log` log
LEFT JOIN `oidplus_log_user`   lusr ON lusr.log_id = log.id
LEFT JOIN `oidplus_log_object` lobj ON lobj.log_id = log.id
WHERE     log.id > 136951 AND log.id < 137000

WIP:

===================================================================
--- OIDplusPageAdminDatabaseBackup.class.php    (Revision 1446)
+++ OIDplusPageAdminDatabaseBackup.class.php    (Arbeitskopie)
@@ -335,6 +335,49 @@
                // Backup logs (Tables log, log_object, log_user)
                $log = [];
                if ($export_log) {
+
+                       $log_tmp = array();
+
+                       $res = OIDplus::db()->query("SELECT    log.id as _log_id, log.unix_ts, log.addr, log.event, ".
+                                                   "          lobj.object, lobj.severity as obj_severity, ".
+                                                   "          lusr.username, lusr.severity as usr_severity ".
+                                                   "FROM      ###log log ".
+                                                   "LEFT JOIN ###log_user   lusr on lusr.log_id = log.id ".
+                                                   "LEFT JOIN ###log_object lobj on lobj.log_id = log.id ");
+                       while ($row = $res->fetch_array()) {
+                               $id = $row['_log_id'];
+                               if (!isset($log_tmp[$id])) {
+                                       $num_rows["log"]++;
+                                       $log_tmp[$id] = [
+                                               "unix_ts" => $row["unix_ts"],
+                                               "addr" => $row["addr"],
+                                               "event" => $row["event"],
+                                               "objects" => [],
+                                               "users" => []
+                                       ];
+                               }
+                               if (($row['object'] ?? '') != '') {
+                                       $num_rows["log_object"]++;
+                                       $log_tmp[$id]['objects'][] = [
+                                               "object" => $row['object'],
+                                               "severity" => $row['obj_severity'] ?? 0
+                                       ];
+                               }
+                               if (($row['username'] ?? '') != '') {
+                                       $num_rows["log_user"]++;
+                                       $log_tmp[$id]['users'][] = [
+                                               "username" => $row['username'],
+                                               "severity" => $row['usr_severity'] ?? 0
+                                       ];
+                               }
+                       }
+
+                       foreach ($log_tmp as $log_id => $data) {
+                               $log[] = $data;
+                       }
+                       unset($log_tmp);
+
+                       /*
                        $res = OIDplus::db()->query("select * from ###log order by id");
                        $rows = [];
                        while ($row = $res->fetch_array()) {
@@ -372,6 +415,7 @@
                                        "users" => $log_users
                                ];
                        }
+                       */
                }
 
                // Backup public/private key

But this does not work for me, because this query runs forever until I get a timeout after 5 minutes:

SELECT    log.id as _log_id, log.unix_ts, log.addr, log.event, lobj.object, lobj.severity as obj_severity, lusr.username, lusr.severity as usr_severity
FROM      oidplus_log log 
LEFT JOIN oidplus_log_user   lusr on lusr.log_id = log.id 
LEFT JOIN oidplus_log_object lobj on lobj.log_id = log.id 
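
One plausible explanation (an assumption here, since the schema definition is not shown) is that the log_id columns of the two child tables are not indexed, so every joined row triggers a scan of the ~137,000-row child tables. That would be worth checking before restructuring the code; the index names below are made up:

CREATE INDEX idx_log_user_log_id   ON oidplus_log_user   (log_id);
CREATE INDEX idx_log_object_log_id ON oidplus_log_object (log_id);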

Note that this query has the disadvantage that event might contain a lot of data, and this amount of data is multiplied by the two joins (each log row is repeated once for every user × object combination). The same thing would happen (and even worse) if we did a single SELECT on the objects, asn1id and iri tables: then we would multiply the description field, which might be very large.
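
One way to avoid the multiplication altogether (a sketch only, not tested; it reuses the OIDplus::db() calls and table placeholders from the patch above) would be to read the three tables separately and merge them by log_id in PHP. The wide event column is then transferred exactly once per entry, and the duplicate object/user rows produced by the join cross product disappear:

$log_tmp = [];

// Pass 1: the wide table; each event is transferred exactly once
$res = OIDplus::db()->query("SELECT id, unix_ts, addr, event FROM ###log");
while ($row = $res->fetch_array()) {
	$log_tmp[$row['id']] = [
		"unix_ts" => $row["unix_ts"],
		"addr"    => $row["addr"],
		"event"   => $row["event"],
		"objects" => [],
		"users"   => []
	];
}

// Pass 2: the narrow object table; no cross product with users
$res = OIDplus::db()->query("SELECT log_id, object, severity FROM ###log_object");
while ($row = $res->fetch_array()) {
	if (isset($log_tmp[$row['log_id']])) {
		$log_tmp[$row['log_id']]['objects'][] = [
			"object"   => $row['object'],
			"severity" => $row['severity'] ?? 0
		];
	}
}

// Pass 3: same for the user table
$res = OIDplus::db()->query("SELECT log_id, username, severity FROM ###log_user");
while ($row = $res->fetch_array()) {
	if (isset($log_tmp[$row['log_id']])) {
		$log_tmp[$row['log_id']]['users'][] = [
			"username" => $row['username'],
			"severity" => $row['severity'] ?? 0
		];
	}
}

$log = array_values($log_tmp);

The grouping is O(1) per row via the id-keyed array, and no join is needed at all.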


The same thing applies to OIDs, where you need to fetch an ASN.1 identifier and an IRI for each OID!
It's just not as extreme for me, because I have more log entries (137k) than OIDs (64k).
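
The same two-pass idea would apply there: prefetch the identifier tables once into OID-keyed arrays instead of issuing two queries per OID (again a sketch; I am assuming the ###asn1id and ###iri tables expose (oid, name) columns):

$asn1ids = [];
$res = OIDplus::db()->query("SELECT oid, name FROM ###asn1id");
while ($row = $res->fetch_array()) {
	$asn1ids[$row['oid']][] = $row['name'];
}

$iris = [];
$res = OIDplus::db()->query("SELECT oid, name FROM ###iri");
while ($row = $res->fetch_array()) {
	$iris[$row['oid']][] = $row['name'];
}

// Per OID, look up $asn1ids[$oid] ?? [] and $iris[$oid] ?? []
// instead of running two SELECTs for every single object.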

... maybe log entries should be pruned? (Maybe even prune by severity?) Maybe save them as TXT before pruning...
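
For example, a hypothetical pruning pass (the cutoff timestamp and severity threshold are made-up values, and severity lives on the log_user/log_object tables, as seen in the backup code above):

-- Delete old log entries that have no high-severity association
DELETE FROM oidplus_log
WHERE unix_ts < 1640000000
  AND id NOT IN (SELECT log_id FROM oidplus_log_object WHERE severity >= 3)
  AND id NOT IN (SELECT log_id FROM oidplus_log_user   WHERE severity >= 3);

-- Then remove the orphaned child rows
DELETE FROM oidplus_log_object WHERE log_id NOT IN (SELECT id FROM oidplus_log);
DELETE FROM oidplus_log_user   WHERE log_id NOT IN (SELECT id FROM oidplus_log);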

... if the backup fails, will the restore then also fail?

@danielmarschall

The whole OIDplus system is a mess when there are 56,000 OIDs.
I tried to use Xdebug to do performance profiling, but I did not get useful results.
On SQL Server via ODBC, the whole system is unusable. Disabling the caching (which reads the whole DB into the cache) makes it a little better if the DB is very large.
I am very unhappy with this situation. It seems like OIDplus is not capable of handling large databases, and I have no idea why.

@wehowski

Hello Daniel,
I am not sure if it helps: I made an update, but the changes from https://github.com/frdl/oidplus-plugin-alternate-id-tracking are not in the core yet, are they?

P.S.: Please excuse the delay, I was quite busy (including offline)...

@danielmarschall

Unfortunately, the system is slow even without the AltID plugin.
