Responsiveness of Server at high CPU load - Database Discussions
#1
(Excerpt from a TAR - still open)

From time to time, our Oracle test server (9.2.0.4 on Intel/Linux, 2 CPUs) became unusable at a CPU load of 99% as shown by top. In this state, nothing else could be done with Oracle; even connecting via sqlplus took about 1 hour (assuming one would wait that long). The processes running were Oracle processes and kswap (meaning that heavy swapping was taking place).

Users complain in such a situation, and my only remedy has been to reboot the server. pstack and oradebug could not be used. After analyzing lots of things, we found that nothing seems to be wrong with the database; it is just that a very inefficient query is running, which blocks the Oracle server and prevents any other activity. One message was found in the alert log, saying

ksbsrv: No startup acknowledgement from forked process after 30 seconds

but no ORA- error appears. Statspack reports revealed an unusually high "process startup" wait time.

In my experience on the Sun/Solaris platform, even if the 4 CPUs of our E3500 are at maximum load (showing an average idle of 0%), the Oracle (8.1.7) server is still available for new sessions (which of course run slower than usual). This happens quite often, by the way, so it is a reliable experience.

Assuming that the situation is caused by a bad query, I am concerned about the limited responsiveness of the server, since most of our queries are of batch type and run for hours on the production platform, which is Sun/Solaris 7. If we transfer the production DB to the new, much faster Intel/Linux platform, we could have serious trouble when such batch jobs run. They would be served on a first-in, first-out basis, serialized one after another (limited by the number of CPUs available).

Is there a way to adjust priorities or something to guarantee an even distribution of the computing power of the Oracle server? Is this more an operating system problem than an Oracle one? (Note: at the OS level, responsiveness is much better.) We use RedHat Linux AS 2.1 with asynch_io=true. This is supposed to be a certified environment (Dell Power Edge 2650) for enterprise use of Oracle.

Oracle Corp. has been quite clueless so far, hence my question to the forum.

Thanks in advance

Rick Denoire
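The concern about first-in, first-out serialization can be made concrete with a toy model. The sketch below is only an illustration under stated assumptions: the function names and the job lengths are hypothetical, and neither function describes how the Linux or Solaris scheduler actually behaves. It compares how long a short new request waits when jobs run to completion in arrival order with how long it takes when the CPUs are time-sliced evenly among all runnable processes.

```python
import heapq


def fifo_start_time(queued_jobs, cpus):
    """Seconds until a newly arrived request gets a CPU when every job
    ahead of it runs to completion in arrival order (hypothetical model)."""
    # Each heap entry is the time at which one CPU becomes free.
    free_at = [0.0] * cpus
    heapq.heapify(free_at)
    for cpu_seconds in queued_jobs:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + cpu_seconds)
    # The new request starts on whichever CPU frees up first.
    return free_at[0]


def fair_share_finish_time(request_cpu_seconds, competing_jobs, cpus):
    """Seconds until the new request finishes when the CPUs are shared
    evenly among all runnable processes. Assumes every competing job
    outlasts the new request, so the share stays constant."""
    runnable = competing_jobs + 1
    slowdown = max(1.0, runnable / cpus)
    return request_cpu_seconds * slowdown


if __name__ == "__main__":
    # Assumed workload: four batch jobs of two hours each on a 2-CPU box,
    # plus one new request that needs one CPU-second.
    print(fifo_start_time([7200.0] * 4, 2))      # 14400.0
    print(fair_share_finish_time(1.0, 4, 2))     # 2.5
```

Under these assumptions the new request would not even start for four hours with strict FIFO, but would finish in 2.5 seconds with even time-slicing. A connect time of about an hour on a machine that is also swapping heavily does not fit the second model, which suggests that CPU scheduling alone may not explain the symptom.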
#2
I wonder if Linux has a problem? There is another similar posting indicating a lockup under heavy load, similar to your symptoms. Is it possible to benchmark Solaris for x86 within your environment? The only problem is that only Oracle 8i is available for Solaris 8/9 for x86.
#5
In this case, you may want to provide some memory information. What is the RAM size, and how much of it is Oracle using? The heavy swapping could be an indication of a memory shortage. On Linux systems I have also seen kswap going crazy even when there is no memory shortage. I would suggest you stay with the major UNICES, at least for production systems. Despite all the hype around Linux, there are still many things to be improved, memory management being one of them. Hopefully we will see some of them in 2.6.
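One way to collect the memory figures asked for here is to read them from `/proc/meminfo`. The following is a minimal sketch, assuming the usual `Key: value kB` line format; the function names and the sample numbers are made up for illustration and are not taken from the server in this thread. It only covers the operating system side. How much memory Oracle itself uses (SGA and PGA) would have to be read from the instance.

```python
def parse_meminfo(text):
    """Parse 'Key:   value kB' lines (as found in /proc/meminfo) into a
    dict mapping each key to its integer value."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a 'Key: value' line
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def memory_summary(info):
    """Derive total RAM, free RAM and swap usage from parsed meminfo values."""
    swap_total = info["SwapTotal"]
    swap_used = swap_total - info["SwapFree"]
    swap_used_pct = 100.0 * swap_used / swap_total if swap_total else 0.0
    return {
        "ram_total_kb": info["MemTotal"],
        "ram_free_kb": info["MemFree"],
        "swap_used_kb": swap_used,
        "swap_used_pct": swap_used_pct,
    }


# Hypothetical sample; on a Linux host, open("/proc/meminfo").read()
# could be passed to parse_meminfo instead.
SAMPLE = """\
MemTotal:      2000000 kB
MemFree:         20000 kB
SwapTotal:     2000000 kB
SwapFree:       500000 kB
"""

if __name__ == "__main__":
    summary = memory_summary(parse_meminfo(SAMPLE))
    for name, value in summary.items():
        print(name, value)
```

A large and growing `swap_used_kb` while `ram_free_kb` stays near zero would point toward a genuine memory shortage rather than a scheduling problem.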
#8
A couple of notes regarding HANGANALYZE are available on MetaLink:

Doc ID: Note:175006.1
Subject: Steps to generate HANGANALYZE trace files
Type: BULLETIN
Status: PUBLISHED
Content Type: TEXT/PLAIN
Creation Date: 04-FEB-2002
Last Revision Date: 10-DEC-2003