Performance/Error Debugging

You may occasionally run into errors on your site. This page is a guide to the most common ones and shows how to debug them using the tools provided in Frappe Cloud and Frappe Framework.

Site Slow: Daily Usage limit reached

This happens when you exceed the CPU hours allotted for your site. If you're unsure how you reached your CPU hours limit, check the Insights tab of your site.

Here, you can scroll down to Advanced analytics to see the Slowest Requests and Slowest Background Jobs graphs. These give you an idea of which endpoints on your site account for the most time and requests. Take the following graphs as an example:

Here, the red bars take relatively long and should be looked into.

The list is sorted in descending order, so the first endpoints in the list are usually the slowest.

Site Slow: 504 Gateway timeout

This can happen when all web workers on your site are busy with previous requests. This can even cause a bench to go down! It is caused by slow APIs. Most of the time these are reports that take too long to run. You can confirm this from your analytics page by looking at the Slowest Requests chart as shown above.

Some common endpoints and their meanings are given below:

Endpoint Meaning
/api/method/frappe.desk.query_report.run Reports from Report doctype
/api/method/frappe.desk.reportview.get Loading of a report or List view of a doctype. If a lot of columns are being fetched with filters on various others, it can get slow depending on indexes.
/api/method/run_doc_method This indicates a whitelisted method in a Document controller is being called

You can also use Frappe's built-in Recorder on your site to figure out what's wrong. Remember to turn it off once you're done to avoid slowing down your site further.

If the endpoint is not something you can optimize, you can try converting it into a background job.

If you own a dedicated server, you should also check your server analytics to see if you're reaching CPU limits on either of your servers (Application or Database).
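The idea of moving slow work out of the web request can be sketched as follows. This is only an illustration under stated assumptions: `generate_statement` and `request_statement` are hypothetical names, and the `queue` and `timeout` values are guesses you would tune for your own job. The `enqueue` argument is injected so that the snippet runs without a Frappe site; on a real site you would pass `frappe.enqueue`.

```python
from typing import Any, Callable


def generate_statement(customer: str) -> str:
    """Hypothetical slow task; stands in for whatever the slow endpoint does."""
    return f"statement for {customer}"


def request_statement(customer: str, enqueue: Callable[..., Any]) -> str:
    """Queue the slow work and return immediately instead of blocking a web worker.

    On a real site you would pass frappe.enqueue as `enqueue`.
    """
    enqueue(
        generate_statement,  # callable to run in a background worker
        queue="long",        # assumption: the long queue suits slow jobs
        timeout=1500,        # assumption: seconds allowed before the job is stopped
        customer=customer,   # extra keyword arguments are passed on to the job
    )
    return "queued"
```

The web request now only pays the cost of queueing, so the web worker is freed up right away while the job runs elsewhere.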

Slow reports

If you see /api/method/frappe.desk.query_report.run at the top of the list, this is a good indication that you can convert such reports into Prepared Reports so that they run in the background and leave you free to use your site.

What does "Other" mean in the chart?

The “Other” bucket covers all requests that aren’t among the top slowest requests. When “Other” is the most prominent, it usually means that a specific pattern of endpoints is slow. For example:

Here, the /custom_app/view/ endpoint is slow and should be investigated by the developer.

If you don't see a pattern of sorts and "Other" is still the slowest endpoint, then it's likely that the server itself is slow and should be looked into.
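If you have a list of request paths and their durations at hand, one way to look for such a pattern is to sum the time per path prefix. The sketch below is a hypothetical helper, not a Frappe Cloud feature; the `slowest_prefixes` name, the `depth` parameter and the sample data are all made up for illustration.

```python
from collections import defaultdict


def slowest_prefixes(requests, depth=2):
    """Sum request durations per path prefix, slowest first.

    `requests` is an iterable of (path, duration_in_seconds) pairs.
    `depth` is how many leading path segments form the prefix.
    """
    totals = defaultdict(float)
    for path, duration in requests:
        segments = [s for s in path.split("/") if s]
        prefix = "/" + "/".join(segments[:depth])
        totals[prefix] += duration
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up sample data for illustration only.
sample = [
    ("/custom_app/view/101", 4.0),
    ("/custom_app/view/102", 3.5),
    ("/api/method/ping", 0.25),
    ("/custom_app/view/103", 5.0),
]
print(slowest_prefixes(sample))
```

Individually, none of the /custom_app/view/ requests would stand out, but grouped by prefix they dominate the total.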

500 Internal Server Error

This usually indicates an application-related issue. If your site is on a custom bench group, you can investigate it as follows:

Checking bench logs (Option 1)

  1. Go to your site's Bench Group dashboard and check the Sites tab

  2. Click on View Logs

  3. Open web.error.log and scroll to the bottom. You should be able to see the traceback:

In this example, the site can't connect to the database server. Restarting the database server might be a solution here.
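When the log is long, scrolling to the most recent traceback can be tedious. The sketch below pulls it out of the log text, assuming the file contains standard Python tracebacks. The `last_traceback` helper and the sample log lines are hypothetical and do not reflect the exact format of web.error.log.

```python
def last_traceback(log_text: str) -> str:
    """Return the text from the last traceback header to the end of the log."""
    marker = "Traceback (most recent call last):"
    start = log_text.rfind(marker)
    if start == -1:
        return ""  # no traceback found
    return log_text[start:].strip()


# Made-up log content for illustration only.
sample_log = """[2024-01-01 10:00:00] Booting worker
Traceback (most recent call last):
  File "app.py", line 1, in <module>
ValueError: first failure
[2024-01-01 10:05:00] Booting worker
Traceback (most recent call last):
  File "app.py", line 2, in <module>
ConnectionError: cannot reach database
"""
print(last_traceback(sample_log))
```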

Using SSH Access (Option 2)

Alternatively, you can locate web.error.log after connecting to your bench over SSH. Please check the SSH Access docs for details.

If you occasionally get a pop-up with the Internal Server Error message, it is likely that a background job is failing. In such cases, checking your Scheduled Job Log, Error Log and worker.err.log file should help.

502 Bad Gateway

This error usually happens when your web worker processes have completely stopped. It can happen due to various reasons. To debug, you should check your logs the same way as above. You can also try to bring the process back up with:

bench restart

What's causing request timed out error?

If a particular action on your site (not all actions), say submission of a document, takes too long and eventually ends with a Request Timed Out popup, it's an application issue, assuming the server is functioning normally. In most cases we can't do much other than try increasing the default HTTP timeout of 2 minutes for web requests.

Here, the slowness could be in your Python application or be due to slow queries.

If the action you're performing is part of your custom app, we'd suggest you try optimizing the code so that it finishes faster. If you're pressed for time, you may also run the particular action from bench console over SSH as a workaround.
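To tell the two apart in your own code, you can time the database part and the Python part separately. This is a minimal, generic sketch using only the standard library; `timed` and `submit_document` are hypothetical names, and the sleeps are placeholders for real work.

```python
import time
from contextlib import contextmanager

timings = {}


@contextmanager
def timed(label):
    """Add the wall-clock time of the wrapped block, in seconds, to timings[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = timings.get(label, 0.0) + time.perf_counter() - start


def submit_document():
    """Hypothetical action; the sleeps stand in for real work."""
    with timed("queries"):
        time.sleep(0.02)  # placeholder for database calls
    with timed("python"):
        time.sleep(0.01)  # placeholder for in-memory processing


submit_document()
print(timings)
```

Whichever label accumulates the most time is the better place to start optimizing.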

If the action is guaranteed to take long, consider converting it to a background job.

On the off chance that the action is not part of a custom app and all other activities on the site are going smoothly, please reach out to ERPNext Support for help.

Request Timeout: Server was too busy to process this request

This happens when a SQL query times out because it could not acquire a lock. This indicates a bug in the application. Some other job may also be holding a lock on a related table, causing the issue. Review any recently added controller hook or scheduled job.

One easy way to debug this is to perform the action that triggers it and while it is happening, check the processlist of your site to see which queries are running.
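Once you have the processlist rows, a small filter can help surface the entries worth a closer look. The sketch below assumes the rows are dicts keyed by the column names of `SHOW FULL PROCESSLIST` (Id, Command, Time, State, Info). The `suspicious_queries` helper, the `min_seconds` threshold and the sample rows are hypothetical.

```python
def suspicious_queries(rows, min_seconds=5):
    """Pick long-running or lock-waiting entries out of processlist rows.

    `rows` is a list of dicts keyed by the SHOW FULL PROCESSLIST column names.
    """
    flagged = []
    for row in rows:
        if row.get("Command") == "Sleep" or not row.get("Info"):
            continue  # idle connection, nothing is executing
        state = row.get("State") or ""
        seconds = row.get("Time") or 0
        if seconds >= min_seconds or "lock" in state.lower():
            flagged.append(row)
    # Longest-running entries first.
    return sorted(flagged, key=lambda r: r.get("Time") or 0, reverse=True)


# Made-up rows for illustration only.
rows = [
    {"Id": 1, "Command": "Sleep", "Time": 120, "State": "", "Info": None},
    {"Id": 2, "Command": "Query", "Time": 42, "State": "Updating",
     "Info": "UPDATE tabItem SET disabled = 0"},
    {"Id": 3, "Command": "Query", "Time": 1,
     "State": "Waiting for table metadata lock",
     "Info": "ALTER TABLE tabItem ADD COLUMN note TEXT"},
    {"Id": 4, "Command": "Query", "Time": 0, "State": "Sending data",
     "Info": "SELECT 1"},
]
print([row["Id"] for row in suspicious_queries(rows)])
```

A long-running write next to a query that is waiting on a lock for the same table is a typical sign that the former is blocking the latter.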

Checking slow queries is also a good idea.

Work-horse terminated unexpectedly; Waitpid returned 9/15 (signal 9/15)

You may see this as the output of an RQ Job. This happens when a background worker gets killed, usually by the OOM Killer as a result of consuming too much memory. In such cases, consider optimizing your code to use less memory. If that is not possible, you'll have to upgrade your application server for more memory.
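One common way to reduce memory use in a background job is to process records in batches instead of loading everything at once. The sketch below shows the general pattern with the standard library only; `in_batches`, `process_all` and the batch size of 500 are hypothetical, and the per-batch work is a placeholder.

```python
from itertools import islice


def in_batches(items, batch_size):
    """Yield lists of at most `batch_size` items, one batch at a time."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def process_all(record_ids, batch_size=500):
    """Hypothetical job body: handle records one batch at a time."""
    processed = 0
    for batch in in_batches(record_ids, batch_size):
        processed += len(batch)  # placeholder for the real per-batch work
    return processed
```

Only one batch is held in memory at any time, so peak usage depends on the batch size rather than on the total number of records.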

Last updated 7 hours ago