2008-08-19 Tue

I’d like to share some great news — The Pythian Group and Open Query have become partners!
Open Query is a leading provider of high-quality MySQL, PostgreSQL and related training in Australia and New Zealand. They offer consulting services too, and are also known for their MySQL Graph Storage Engine. Feel free to browse through Open Query web-site for more info.
Open Query was founded by Arjen Lentz, who was employee number 25 at MySQL AB. If you follow the MySQL community then I’m sure you already read Arjen’s blog.
Since you’re reading this blog, I guess you probably already know what Pythian does, but if you want to learn more, please click through to our home page.
Together with Open Query, we are going to extend our service offerings and strengthen our positions in outsourced database management services, consulting, and training.
Continuing my experiments with our 2 Node 10g RAC Test system, I carried out upgrade of Oracle Clusterware and Oracle RAC Database from 10.2.0.1 to 10.2.0.4. I have tried to document the steps for upgrading Oracle Clusterware(Rolling Upgrade) and RAC Database upgrade in this post. In case you observe any mistakes, please let me know (I am still learning RAC and bound to make mistakes
)
First step is to download the 10.2.0.4 Patchset from metalink. In our case ,we downloaded Patch 6810189 (10g Release 2 (10.2.0.4) Patch Set 3 for Linux x86). You can follow Patch Readme for detailed steps.
We will be doing Rolling upgrade for Oracle Clusterware i.e we will only bring one node down for patching while other node will be available and accepting database connections.
1)Shutdown the DBconsole and Isqlplus
$ emctl stop dbconsole
$ isqlplusctl stop
2) Shutdown the associated service on the node
[oracle@blrraclnx1 ~]$ srvctl stop service -d orcl -s orcl_taf -i orcl1
3) Shutdown Database Instance and ASM instance on node (if present)
[oracle@blrraclnx1 ~]$ srvctl stop instance -d orcl -i orcl1
To stop ASM, use following command
[oracle@blrraclnx1 ~]$ srvctl stop asm -n blrraclnx1
4)Next step is to stop Nodeapps services on the node
[oracle@blrraclnx1 ~]$ srvctl stop nodeapps -n blrraclnx1
Before proceeding to installing Oracle Clusterware Patch, let’s confirm if services have been stopped
HA Resource Target State ----------- ------ ----- ora.blrraclnx1.ASM1.asm OFFLINE OFFLINE ora.blrraclnx1.LISTENER1_BLRRACLNX1.lsnr OFFLINE OFFLINE ora.blrraclnx1.gsd OFFLINE OFFLINE ora.blrraclnx1.ons OFFLINE OFFLINE ora.blrraclnx1.vip OFFLINE OFFLINE ora.blrraclnx2.ASM2.asm ONLINE ONLINE on blrraclnx2 ora.blrraclnx2.LISTENER1_BLRRACLNX2.lsnr ONLINE ONLINE on blrraclnx2 ora.blrraclnx2.gsd ONLINE ONLINE on blrraclnx2 ora.blrraclnx2.ons ONLINE ONLINE on blrraclnx2 ora.blrraclnx2.vip ONLINE ONLINE on blrraclnx2 ora.orcl.db ONLINE ONLINE on blrraclnx2 ora.orcl.orcl1.inst OFFLINE OFFLINE ora.orcl.orcl2.inst ONLINE ONLINE on blrraclnx2 ora.orcl.orcl_taf.cs ONLINE ONLINE on blrraclnx2 ora.orcl.orcl_taf.orcl1.srv OFFLINE OFFLINE ora.orcl.orcl_taf.orcl2.srv ONLINE ONLINE on blrraclnx2
5)Set DISPLAY variable and execute runinstaller from Patch Directory
[oracle@blrraclnx1 Disk1]$ ./runInstaller
This will open OUI screen. Select Oracle Clusterware Home for Patching. Find below screenshot for same
This will automatically select all the nodes available in cluster and propogate patch binaries to the other node.
6) On the Summary screen, click Install.OUI will prompt you now to run, following two scripts as Root which will upgrade Oracle Clusterware
# $ORA_CRS_home/bin/crsctl stop crs
# $ORA_CRS_home/install/root102.sh
Now we need to repeat the steps 1-4 and step 6 on Node 2. Step 5 is not required as binaries have been already copied over to node 2.
RAC Database Patching cannot be done in a rolling fashion and requires Database to be shutdown.
1)Shutdown the DBconsole and Isqlplus
$ emctl stop dbconsole
$ isqlplusctl stop
2) Shutdown the associated service for database
[oracle@blrraclnx1 ~]$ srvctl stop service -d orcl
3) Shutdown Database Instance and ASM instance on node (if present)
[oracle@blrraclnx1 ~]$ srvctl stop database -d orcl
To stop ASM, use following command on both the nodes
[oracle@blrraclnx1 ~]$ srvctl stop asm -n blrraclnx1 [oracle@blrraclnx1 ~]$ srvctl stop asm -n blrraclnx2
4)Next step is to stop Listener on both the nodes
[oracle@blrraclnx1 ~]$ srvctl stop listener -n blrraclnx1 -l LISTENER1_BLRRACLNX1 [oracle@blrraclnx1 ~]$ srvctl stop listener -n blrraclnx2 -l LISTENER1_BLRRACLNX2
5)Set DISPLAY variable and execute runinstaller from Patch Directory
[oracle@blrraclnx1 Disk1]$ ./runInstaller
This will open OUI screen. Select Database Home for Patching.
6) On the Summary screen, click Install.When prompted, run the $ORACLE_HOME/root.sh script as the root
user on both the nodes. On completion of this , we need to perform post installation steps.
7)Start listener and ASM Instance on both the nodes
[oracle@blrraclnx1 ~]$ srvctl start listener -n blrraclnx1 -l LISTENER1_BLRRACLNX1
[oracle@blrraclnx1 ~]$ srvctl start listener -n blrraclnx2 -l LISTENER1_BLRRACLNX2
[oracle@blrraclnx1 ~]$ srvctl start asm -n blrraclnx1
[oracle@blrraclnx1 ~]$ srvctl start asm -n blrraclnx2
8)For Oracle RAC Installation, we need to set CLUSTER_DATABASE=FALSE before upgrading
[oracle@blrraclnx1 ~]sqlplus "/ as sysdba"
SQL>startup nomount
SQL> alter system set cluster_database=false scope=spfile;
System altered.
SQL>shutdown immediate;
SQL>startup upgrade
SQL>spool 10204patch.log
SQL>@?/rdbms/admin/catupgrd.sql
SQL>spool off
Log file needs to be reviewed for any errors. catupgrd.sql took 42 minutes on my system. In case CLUSTER_DATABASE parameter is not set to False, you will get following error while starting database in upgrade mode
ORA-39701: database must be mounted EXCLUSIVE for UPGRADE or DOWNGRADE
We need to Restart the database now and run utlrp.sql.
SQL> SHUTDOWN IMMEDIATE
SQL> STARTUP
SQL> @?/rdbms/admin/utlrp.sql
Confirm whether Database has been upgraded successfully by querying DBA_REGISTRY;
select comp_name,version,status from dba_registry;
Now set Cluster_database parameter to TRUE and start Database
SQL>alter system set cluster_database=true scope=spfile;
SQL>Shutdown immediate;
[oracle@blrraclnx1 ~]$ srvctl start database -d orcl
[oracle@blrraclnx1 ~]$ srvctl start service -d orcl
To upgrade DBConsole, run following command
emca -upgrade db -cluster
This completes the upgrade process.
Related Posts
While going through the routine checks from Grid Control, I found a critical alert stating “clusterware integrity check failed” and by clicking on this message it says that there is problem with some metric collections on RAC environment.
To check the node reachability status following query was run:
$ $CRS_HOME/bin/cluvfy comp nodecon -n all
This will check the internode connectivity for all nodes in the cluster. It came out with following message:
$ $CRS_HOME/bin/cluvfy comp nodecon -n all Verifying node connectivity Verification of node connectivity was unsuccessful on all the nodes.
Even the CRS component check was unsuccessful:
$ $CRS_HOME/bin/cluvfy comp crs -n all
It came out with the following message:
$ $CRS_HOME/bin/cluvfy comp crs -n all Verifying CRS integrity Verification of CRS integrity was unsuccessful on all the nodes.
After this it was quite obvious to check the CRS status:
$ crsctl check crs CSS appears healthy CRS appears healthy EVM appears healthy
$crs_stat -t Name Type Target State Host ------------------------------------------------------------ ora.orcl.db application ONLINE ONLINE rac1 ora....11.inst application ONLINE ONLINE rac1 ora....12.inst application ONLINE ONLINE rac2 ora....vice.cs application ONLINE ONLINE rac2 ora....l1.srv application ONLINE ONLINE rac1 ora....l1.srv application ONLINE ONLINE rac2 ora....SM1.asm application ONLINE ONLINE rac1 ora....DC.lsnr application ONLINE ONLINE rac1 ora....idc.gsd application ONLINE ONLINE rac1 ora....idc.ons application ONLINE ONLINE rac1 ora....idc.vip application ONLINE ONLINE rac1 ora....SM2.asm application ONLINE ONLINE rac2 ora....C2.lsnr application ONLINE ONLINE rac2 ora....dc2.gsd application ONLINE ONLINE rac2 ora....dc2.ons application ONLINE ONLINE rac2 ora....dc2.vip application ONLINE ONLINE rac2
$$CRS_HOME/bin/olsnodes rac1 rac2
This confirmed that the CRS install is valid, but the question now is why the cluster verification utility (CVU) was failing?
To find the reason I enabled the tracing of CVU as:
$export SRVM_TRACE=true
It will set the environment variable SRVM_TRACE to true and tracing of CVU will generate a trace file under $CRS_HOME/cv/log with name like “cvutrace.log.X”
After setting this and again running $CRS_HOME/bin/cluvfy comp crs -n all trace file with name cvutrace.log.0 was generated.
And a message in cvutrace.log like
"ksh: CVU_10.2.0.2_dba/exectask.sh: cannot execute"
Now its is clear that oracle is not able to execute exectask.sh and cheking the permission and ownership of exectask.sh:
$CRS_HOME/cv/remenv ls -ltr -rw-r--r-- 1 oracle dba 184 Jan 9 2008 exectask.sh -rw-r--r-- 1 oracle dba 268386 Jan 9 2008 exectask
The permission of these two files was changed. After changing the permission back to 755 CUV was showing correct results.
$chmod 755 exectask*
It is still not discovered how the permission of these files got changed.
Related Posts
2008-08-18 Mon
Today I worked on rather interesting customer problem. Site was subject what was considered DDOS and solution was implemented to protect from it. However in addition to banning the intruders IPs it banned IPs of web services which were very actively used by the application which caused even worse problems by consuming all apache slots which were allocated to the problem. Here are couple of interesting lessons one can learn from it.
Implement proper error control In reality it took some time to find what was the issue because there was no error reporting for situation of unavailable web services. If log would be flooded with messages about web services being unavailable it would be much easier to find.
User Curl PHP Has a lot of functions which can accept URL as parameter and just fetch the data transparently for you. They however do not give you good error control and timeout management compared to curl module. Use that when possible it is easy. You can implement your own class to fetch required URL with single call while having all needed timeout handling and reporting to match your application needs. If you’re using PHP functions make sure default_socket_timeout has proper value or set it per session.
Set Curl Timeouts Set both TIMEOUT and CONNECT_TIMEOUT as these apply to different connection stages and just setting timeout is not enough.
Beware of PHP sessions “files” handler I already wrote about this topic, but when troubleshooting this all takes another angle. Default file handler means file gets locked while PHP request is being served. In this case because of network stall request could be taking 100+ seconds. Users are inpatient and do not wait so long pressing reload multiple times… which just adds to the list of users waiting on session file lock. This not only makes apache slots consumed at much higher pace but makes it harder to find what exactly is causing the lock because most of offending processes you can find from apache “server-status” will be just waiting on file to be unlocked. I used “gdb” to connect to the process showing high number of seconds since start finding where it is stuck. If it is somewhere in curl module (or mysql - waiting on long query to come back) - this is our query if it is waiting on the session file lock we can get that file and use fuser to see what other processes are using that files - these would be either waiting on locks or owning the lock and so one of them is the process we’re looking for. Things are much easier with say memcached session storage - this does not cause any locks for parallel session use so only the process which actually stalls waiting on external resource will show high number of seconds since request start.
Entry posted by peter | 3 comments
So, you have a binlog. You want to find out something specific that happened inside of it. What to do? mysqlbinlog has some neat features, which I thought we would look at here.
I should first explain what mysqlbinlog really is. It is a tool that lets you analyze and view the binlogs/relaylogs from mysql, which are stored in binary format. This tool converts them to plaintext, so that they’re human-readable.
For the first tip, let’s start with the --read-from-remote-server option, which allows you to examine a binlog on a master server in order, perhaps, to dump it onto your slave and compare master/slave logs for potential problems*.
$ mysqlbinlog --read-from-remote-server -uwesterlund -p mysql-bin.000001 -h 127.0.0.1 -P 3306 | head -5 Enter password: /*!40019 SET @@session.max_insert_delayed_threads=0*/; /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/; DELIMITER /*!*/; # at 4 #080815 19:25:23 server id 101 end_log_pos 107 Start: binlog v 4, server v 6.0.5-alpha-log created 080815 19:25:23 at startup
Pretty useful!
Now, let’s assume we have a binlog that is 94 lines long*:
There are quite a few “tuning primers” and “my.cnf generators” and “sample my.cnf files” online. The ultimate tool for generating an optimal my.cnf is not a tool. It’s a human with many years of experience, deep knowledge of MySQL and the full application stack, and familiarity with your application and your data.
I don’t know exactly the percentage, but quite a few of the servers I take a look at have been “optimized” with some tuning primer or question-and-answer script that spits out “optimal” parameters for my.cnf.
Most of the time these servers are far from optimal. Sometimes the my.cnf parameters are extremely wrong, to the point of causing a severe performance penalty.
If it were as easy as writing a tool to do this, don’t you think Maatkit would have mk-optimal-mycnf already? In my opinion — as someone who knows very well the complexity of creating a good my.cnf — it’s practically impossible. Much harder than syncing data, or manipulating a replication hierarchy, or any of the other things Maatkit can do already. And I doubt I’ll ever even feel motivated to try creating such a tool.
Don’t bother with scripts. Don’t waste your time with most of the advice you see on the web in forums — much of it is fundamentally wrong, even when it seems to come from an informed source. Don’t put too much faith in the my.cnf samples that come with your operating system; many of them have very bad advice in the comments, such as instructing you on how to set up replication in ways that guarantee breakage.
If you want solid advice, ask someone who knows what they’re doing (and can prove it). Or buy our book.
But even more fundamentally, you should not focus so much on my.cnf. It is not the be-all and end-all of performance. Tuning your server settings has far less impact on performance than tuning your schema, indexing, queries and — you guessed it — thinking deeply about your application architecture. Server settings are a distraction and a waste of time for most people.
Most my.cnf files I see only need minor tweaks, which give only so-so performance improvements. Tuning my.cnf only helps a lot when my.cnf has extremely bad parameters. The kind you’ll get from tuning primers and automated my.cnf optimization scripts.
Entry posted by Baron Schwartz | 13 comments
How do you design a reliable distributed file system when the expected availability of the individual nodes are only ~1/5? That is the case for P2P systems. Dominik Grolimund, the founder of a Swiss startup Caleido will show you how! They have launched Wuala, the social online storage service which scales as new nodes join the P2P network.
The goal of Wua.la is to provide distributed online storage that is:
- large
- scalable
- reliable
- secure
by harnessing the idle resources of participating computers.
This challenge is an old dream of computer science. In fact as Andrew Tanenbaum wrote in 1995:
"The design of a world-wide, fully transparent distributed filesystem fot simultaneous use by millions of mobile and frequently disconnected users is left as an exercise for the reader"
After three years of research and development at at ETH Zurich, the Swiss Federal Institute of Technology on a distributed storage system, Caleido is ready to unveil the result: Wuala. Wuala is a new way of storing, sharing, and publishing files on the internet. It enables its users to trade parts of their local storage for online storage and it allows us to provide a better service for free. In this Google Tech Talk, Dominik will explain what Wuala is and how it works, and he will also show a demo.
2008-08-17 Sun
Update: Dormando asks Should you use memcached? Should you just shard mysql more?. The idea of caching is the most important part of caching as it transports you beyond a simple CRUD worldview. Plan for caching and sharding by properly abstracting data access methods. Brace for change. Be ready to shard, be ready to cache. React and change to what you push out which is actually popular, vs over planning and wasting valuable time.
Feedster's François Schiettecatte wonders if Fotolog's 21 memcached servers wouldn't be better used to further shard data by adding more MySQL servers? He mentions Feedster was able to drop memcached once they partitioned their data across more servers. The algorithm: partition until all data resides in memory and then you may not need an additional memcached layer.
Parvesh Garg goes a step further and asks why people think they should be using MySQL at all?
Related Articles
2008-08-16 Sat
Pre-generating static files is an oldy but a goody, and as Thomas Brox Røst says, it's probably an underused strategy today. At one time this was the dominate technique for structuring a web site. Then the age of dynamic web sites arrived and we spent all our time worrying how to make the database faster and add more caching to recover the speed we had lost in the transition from static to dynamic.
Static files have the advantage of being very fast to serve. Read from disk and display. Simple and fast. Especially when caching proxies are used. The issue is how do you bulk generate the initial files, how do you serve the files, and how do you keep the changed files up to date? This is the process Thomas covers in his excellent article Serving static files with Django and AWS - going fast on a budget", where he explains how he converted 600K thousand previously dynamic pages to static pages for his site Eventseer.net, a service for tracking academic events.
Eventseer.net was experiencing performance problems as search engines crawled their 600K dynamic pages. As a solution you could imagine scaling up, adding more servers, adding sharding, etc etc, all somewhat complicated approaches. Their solution was to convert the dynamic pages to static pages in order to keep search engines from killing the site. As an added bonus non logged-in users experienced a much faster site and were more likely to sign up for the service.
The article does a good job explaining what they did, so I won't regurgitate it all here, but I will cover the highlights and comment on some additional potential features and alternate implementations...
2008-08-15 Fri
Welcome to the 110th edition of Log Buffer, the weekly review of database blogs.
From the MySQL side, Jeremy Zawodny gets things going this week. He writes about his doubts over the long-term performance of InnoDB, specifically the cost of multiversion concurrency control, particularly in a master-slave arrangment or a DW. Jeremy comments, “[The] disk bloat, fragmentation, and ongoing degradation in performance may be an argument for having some slaves that keep the same data in MyISAM tables.” His readers, however, point out some diagnostics and tools to remedy this concern.
Not that MyISAM is without foibles. Case in point, on the MySQL Performance Blog, Peter Zaitsev says, Beware of MyISAM Key Cache mutex contention. “. . . inserting in parallel into different tables when indexes fit in memory results in very bad contention causing hundreds of thousands of context switches per second with far less work done than one would hope.” Happily, as Peter reports, multiple key caches mitigates.
Peter also surveys the choices for picking a datatype for STATUS fields — “. . . ’new’, ‘confirmed’, ‘in production’, ’shipped’ status of job, message etc. People use variety of ways to handle them often without giving enough thought to the choice which can cause problems later.” VARCHAR(N), ENUM, TINYINT — which is right for the job?
Here on the Pythian Group Blog, Nicklas Westerlund demonstrates adding a unique constraint on a table with duplicate data.
Also here at home, Sheeri Cabral’s item on why you want to switch to MySQL 5.1. She argues that additions such as online ALTER TABLE, online table-based logging, and INFORMATION_SCHEMA.PROCESSLIST are compelling features worth an upgrade.







