Open Query blog
Ken Jacobs leaves Oracle
Matt Asay writes today in Oracle loses some MySQL mojo about Ken Jacobs leaving Oracle. For me, that’s a major bummer. Ken has been a long-time visitor of the MySQL Conference and that’s where I first met him: a friendly and knowledgeable person, on database technology in general but also about MySQL. When Innobase Oy got bought by Oracle, InnoDB got placed under Ken’s leadership and did pretty well there. We’d occasionally exchange emails, and I’ve always found him to be responsive and helpful.
I think it was kinda presumed by people that the technical part of MySQL at Oracle would also reside with Ken. Obviously now, that’s not going to be the case. What that means exactly, I don’t know as I am not familiar with the other person (Edward Screven). We’ve got to know Ken over the years, so it would’ve been nice to keep going with him. Oh well.
Now we’ll see what Edward does with it all, and how he will interact with the MySQL community. And I wonder what new adventures Ken might be off to, if any?
Friendlist Graph Module for Drupal
At DrupalSouth 2010 (Wellington) after LCA2010, Peter and I implemented a Drupal module as a practical example of how the OQGRAPH engine can be used to enable social networking trickery in any website. The friendlist_graph module (available from GitHub) extends friendlist, which implements basic functionality of friends (2-way) and fans (1-way) for Drupal users.
The friendlist_graph module transposes the friendlist data using an OQGRAPH table, allowing you to query it in new and interesting ways. By adding some extra Drupal Views, it allows you to play Six Degrees of Kevin Bacon with your Drupal users or find out how two arbitrary users are connected. It can find a path of arbitrary length near-instantly. Previously, you’d just avoid doing any such thing as it’s somewhere between impossible/limited/slow/painful in a regular relational schema.
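To make the "how are two users connected" question concrete outside the database, here is a minimal Python sketch of a shortest-path search over friend links. It is not OQGRAPH's implementation or its query syntax; the `friends` data and the `shortest_path` helper are made up purely for illustration.

```python
from collections import deque

# Hypothetical friend links (undirected), keyed by user id.
friends = {
    1: [2, 3],
    2: [1, 4],
    3: [1],
    4: [2, 5],
    5: [4],
    6: [],  # a user with no connections
}

def shortest_path(graph, start, goal):
    """Breadth-first search: shortest chain of users from start to goal, or None."""
    previous = {start: None}   # remembers how each user was first reached
    queue = deque([start])
    while queue:
        user = queue.popleft()
        if user == goal:
            path = []
            while user is not None:   # walk back to the start
                path.append(user)
                user = previous[user]
            return path[::-1]
        for friend in graph.get(user, []):
            if friend not in previous:
                previous[friend] = user
                queue.append(friend)
    return None

print(shortest_path(friends, 3, 5))  # [3, 1, 2, 4, 5]
print(shortest_path(friends, 1, 6))  # None
```

Doing this kind of traversal with plain self-joins needs one join per hop, which is why it gets painful in a regular relational schema once the path length is not known in advance.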
Now think beyond: retrieve/share connections using Open Social, FOAF, Twitter/Identi.ca, logins with OpenID, and you “instantly” get a very functional social networking enabled site that does not rely on localised critical mass!
We tested with about a million users in Drupal (and approx 3.5 million random connections), which worked fine – the later demo at DrupalSouth stuffed up only because I hadn’t given the demo VM sufficient memory.
Naturally, you could do the same in Joomla! or another CMS or any site for that matter, we just happened to be at DrupalSouth so a Drupal module was the obvious choice. Take a peek at the code, it’s pretty trivial. Just make sure you run a version of MySQL that has the OQGRAPH engine, for instance 5.0.87-d10 (Sail edition!) from OurDelta.
Petition for MySQL consideration in Oracle+Sun merger
MySQL requires special consideration in the Oracle+Sun merger, otherwise both Oracle and MySQL users and vendors will literally pay the price. If you agree, please sign this petition now.
To be very clear, Open Query is in favour of the merger, we feel that overall it’s a good fit. We would also like to see it happen quickly, as obviously this is best for Sun employees and clients, as well as Oracle’s broad business prospects.
Know your data – and your numeric types.
In this “Good Practice/Bad Practice” post I hope to give some guidelines to choosing between MySQL’s numeric types, using longitude and latitude as a modelling example. (Disclaimer: I am not a mathematician, and the generalisations here are meant to help with practical modelling questions rather than be rigorously theoretical.)
Numeric types in MySQL fall into two main varieties:
- “precise” types such as INTEGER and DECIMAL;
- the IEEE-standard floating point types FLOAT and DOUBLE.
As a rule of thumb, the first group are for exact, or “counted” quantities. The INTEGER types represent whole numbers, and DECIMAL represents “fixed point” decimal, with a preset number of places after the decimal point.
Various widths of INTEGER are available in MySQL, from 8-bit TINYINT to 64-bit BIGINT. Calculations with integer types are fast, as they usually correspond to hardware register sizes.
DECIMAL is commonly used for quantities like decimal currency where the number of digits of precision is known and fixed. For example, exactly counting pennies in two decimal digits. Computation with DECIMAL is slower than other types, but this is unlikely to impact most applications.
In the other category are FLOAT and DOUBLE, the 32-bit and 64-bit IEEE standard types. They are usually supported in hardware and are therefore fast and convenient for arithmetic. These are generally good choices for “measurements” – values with limited precision.
It is important to understand what is meant by “precision” (Wikipedia has a full discussion). For example, I can measure my height at 185.8 centimetres. Because of the way I make the measurement, which we know to be approximate, this figure is understood to have only one meaningful digit after the decimal point.
Precision is a property of all inexact real world “measurements” – such as position, length, weight, brightness; it is usually expressed as the number of “significant figures” or significant digits. (My height measurement has four significant digits.) This should be considered when values are displayed. It is somewhat misleading to represent my height as 185.8000 – common sense tells us that the value is not this accurate.
Serious problems can occur when we do not know the actual precision of measurements, if we wrongly assume a greater precision than exists. A typical example might be a GPS map display which uses measured position to locate the user relative to features such as rivers, roads, and railway tracks. The map display is high resolution and implies a great deal of precision to the viewer. Let us say that based on incoming data, it places our vehicle three metres East of a river. If the measurement has a true precision of, say, 20 metres, we cannot even know which side of the river we are on! So to allow users to draw safe conclusions, the presentation of data needs to take precision into account.
(While very frequently cited as examples of “precise” figures, not all monetary values are. For example, the value of the USA’s national debt was estimated today at $11,983,250,643,192.95. I am no economist, but rather few of these digits can be actually significant!)
Throwing away precision with inappropriate modelling is also a potential problem. Ideally measurements arrive with an implied or explicit precision. But even without that, we need to assure ourselves that we are safely storing them.
Keep in mind that we are always dealing with two kinds of precision:
- Machine precision, which is defined by the chosen type; and
- Data precision, which is a property of the values we are storing.
If I am storing my height as cm in a FLOAT column, the data precision is only 4 significant digits as discussed, but the machine precision of this column is always about 7 significant digits, no matter what we try to store. Clearly to avoid throwing away significant parts of your input, the machine precision should exceed your data precision.
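A quick way to see machine precision at work is to round-trip a value through a 32-bit float. The sketch below uses Python's standard `struct` module to stand in for a FLOAT column; the `to_float32` helper is my own name for illustration, not anything MySQL provides.

```python
import struct

def to_float32(value):
    """Round a Python float (64-bit) to the nearest 32-bit float and back."""
    return struct.unpack("f", struct.pack("f", value))[0]

height_cm = 185.8                 # 4 significant digits of data precision
stored = to_float32(height_cm)

print(stored)                          # close to 185.8, but not exactly equal
print(round(stored, 1) == height_cm)   # True: the 4 meaningful digits survive

# A value carrying more digits than a 32-bit float can hold loses the excess.
print(to_float32(123456789.0))         # 123456792.0
```

The height survives because its data precision (4 digits) is below the roughly 7 digits a 32-bit float can carry; the 9-digit value does not.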
Consider a CSV file of latitude and longitude, and notice that every value has 10 digits after the point:
Afton Station,NS,45.6051050000,-61.6974950000
Agassiz,BC,49.2421627750,-121.7496169988
This does not mean that every value has 13 digits of precision – if that were so, then these coordinates would be accurate to within a few micrometres on the ground! This is clearly not possible.
Let’s say we merely want to stick markers on a web page showing a national map. One hundred metre resolution would be more than adequate. Given an Earth circumference of 40,075,020 metres, 100 metres is approximately 1/1113 of a degree. While three digits (0.001) can represent 1/1000ths of a degree, this is not quite precise enough, so let’s keep four fractional digits after the point. With up to three digits before the point (longitudes run to ±180), we are therefore looking to represent 7 significant figures, for example:
Afton Station,NS,45.60511,-61.69750
Which of FLOAT or DOUBLE is the right type to use for such values? Let’s investigate.
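Before digging into the types, the degrees-to-metres arithmetic used to choose the number of fractional digits can be checked with a short sketch. It takes the circumference figure from the text and treats a degree as a fixed distance, which holds for latitude (and for longitude only at the equator); `metres_for` is a hypothetical helper name.

```python
EARTH_CIRCUMFERENCE_M = 40_075_020   # figure used in the text

metres_per_degree = EARTH_CIRCUMFERENCE_M / 360
print(round(metres_per_degree / 100))   # 1113, i.e. 100 m is about 1/1113 of a degree

def metres_for(fractional_digits):
    """Ground distance covered by one unit in the last fractional digit."""
    return metres_per_degree / 10 ** fractional_digits

for digits in (3, 4, 5):
    print(digits, round(metres_for(digits), 2))
# 3 111.32
# 4 11.13
# 5 1.11
```

Three fractional digits give steps of about 111 metres, just over the 100 metre target, which is why a fourth digit is needed.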
A key difference between floating point types and fixed point representations (such as DECIMAL) is that while the overall binary precision is fixed, the precision of the fractional part of the value can vary! The available precision depends on the magnitude of the value. The highest precision is available for values closest to zero, and precision gets worse as numbers increase in magnitude (by every power of two).
To understand the effect of this, it is necessary to examine floating point representation in more detail. (I am going to handwave more esoteric features of the IEEE standard and try to give a general picture. In particular I am not going to talk about rounding, biased exponents, or denormalised numbers.)
Floating point values have three parts:
- a sign (+/-)
- an “exponent” (binary scale factor)
- the value itself (known as the fraction or “mantissa”).
These are analogous to decimal “scientific” or “exponential” notation that you may already be familiar with (e.g. 76.4935 = 7.64935 x 10^1 or 7.64935E+1, where 1 is the decimal exponent).
The combination of these fields precisely defines a rational number, whose value is desirably “close enough” to the value you need to store (which was imprecise to begin with, so the approximation involved in converting to floating point representation is normally not a problem).
The overall precision available is determined by the bits allowed to store the fraction. For reference,
- FLOAT allows 23 physical bits for the fraction (with hidden bit, effectively 24 bits of fraction)
- DOUBLE allows 52 physical bits for the fraction (with hidden bit, effectively 53)
Respectively, this is 7 and 15 precise decimal digits in the whole figure. So FLOAT probably meets our 7 digit requirement.
To make sure, let’s work backwards and confirm just how precisely a FLOAT value can represent a latitude. To do this I will show how an example value is converted into the FLOAT representation. None of the values in our table will have latitudes greater than 77 degrees, so I will pick 76.4935. (Higher values have larger exponents and hence the least available precision for the fractional part, so are the safest test.)
First we need to determine the correct exponent for the value. Then we can work out the real-world “resolution” of the number, i.e. how much the actual value changes if we change the fraction by the smallest possible amount (i.e., in its least significant bit).
The exponent identifies the smallest power of 2 that is larger than the value; dividing by it “normalises” the value into the range from 0.5 up to (but not including) 1. Since our starting value is more than 1, we need to look at divisors among the powers of 2 from one upwards: 1, 2, 4, 8, 16, 32, 64, 128, 256… A glance at this list shows us that for 76, a divisor of 128 (2^7) is the right one; i.e., the floating point exponent is 7. And the resulting fraction part is 76.4935 / 128, or in decimal, 0.59760546875.
(Normalisation works similarly for starting values less than one, but in the other direction. We multiply by the largest power of 2 that leaves the value less than one, and store the negative exponent. Negative values are simply dealt with by converting to positive before normalising, and noting a negative sign.)
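Python's standard `math.frexp` performs this same normalisation, returning a fraction of magnitude at least 0.5 and below 1 together with the power-of-2 exponent, so it makes a handy check on the working above. This only illustrates the arithmetic; it is not how MySQL stores the value internally.

```python
import math

# Value greater than one: divide by 2**7 = 128.
fraction, exponent = math.frexp(76.4935)
print(fraction, exponent)          # 0.59760546875 7
print(fraction * 2 ** exponent)    # 76.4935

# Value less than one: the exponent goes negative.
print(math.frexp(0.1))             # (0.8, -3)

# Negative value: same magnitude, sign carried on the fraction.
print(math.frexp(-76.4935))        # (-0.59760546875, 7)
```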
Since all non-zero positive numbers begin with binary ‘1’, IEEE representation cleverly implies this “hidden” 1 bit, and doesn’t physically store it. This frees up one more bit for the fraction, i.e. giving a total of 24 bits for FLOAT precision. (Because of this, and “exponent biasing”, the binary sequence shown below isn’t the actual IEEE bit pattern representing this number.)
Assuming a fraction of 24 bits, let’s examine its “binary expansion”. This is effectively just the result of 0.59760546875 x 2^24, written out in base 2:
1001100 . 01111110010101100   (total 24 bits including "hidden" bit)
^^^^^^^   ^^^^^^^^^^^^^^^^^
whole #   fraction part
= 76    . 4935 (approx)
Because the exponent is 7, the first 7 bits in my binary sequence above are the whole number part (= binary 1001100 = 76). I’ve put a gap where the “binary point” belongs. Written out like this, we can see that we have 17 bits to the right of this point. 2^17 = 131072; so, around this magnitude of 76 degrees (and up to 128, as at 128 the binary exponent increases to 8), we can resolve no worse than 1/131072th of a degree. This is enough bits for five decimal digits (which only requires 1/100000th).
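The 24-bit expansion and the resulting resolution can be reproduced with a few lines of Python. This is a sketch of the arithmetic described here, using the exponent of 7 worked out earlier; the cross-check at the end uses the standard `struct` module to round the value to a real 32-bit float.

```python
import struct

value = 76.4935
exponent = 7          # from the normalisation step: 76.4935 < 2**7
fraction_bits = 24    # FLOAT: 23 stored bits plus the hidden bit

bits_after_point = fraction_bits - exponent      # 17
scaled = round(value * 2 ** bits_after_point)    # nearest 24-bit integer
expansion = format(scaled, "b")
print(expansion[:exponent], ".", expansion[exponent:])
# 1001100 . 01111110010101100

# Smallest step at this magnitude, in degrees and in metres of latitude.
metres_per_degree = 40_075_020 / 360
step_degrees = 1 / 2 ** bits_after_point
print(2 ** bits_after_point, round(step_degrees * metres_per_degree, 2))
# 131072 0.85

# Cross-check: an actual 32-bit float holds exactly this 24-bit value.
stored = struct.unpack("f", struct.pack("f", value))[0]
print(stored == scaled / 2 ** bits_after_point)  # True
```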
So how precise on the ground will this be?
1/131072th of a degree of latitude is about 0.85 metres: much better than our hoped for resolution of 100 metres. So we have shown that FLOAT is more than adequate for the job. This is not to say that FLOAT is the correct choice for all geographic uses; only that it is adequate for this use, where we decided 100m resolution was enough. On the other hand, source data for geocoding may need more precision than FLOAT can deliver. (Naturally the extra precision must be present in the source data. Simply using a more precise type cannot add any precision to the original measurement.)
The analysis above can be done for DOUBLE, of course. For interest’s sake, the equivalent binary expansion is:
1001100 . 0111111001010110000001000001100010010011011101   (total 53 bits including "hidden" bit)
We have 29 more bits to play with, or a total of 46 fraction bits after the whole number part, at this magnitude. This is ridiculously precise, and can resolve no worse than 1/70368744177664th of a degree; or about 0.0000016mm on the surface of the globe. (This is enough to represent 13 decimal digits after the point.)
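For DOUBLE the same check is even shorter, because Python floats are themselves 64-bit IEEE doubles. The sketch below assumes Python 3.9 or later, where `math.ulp` reports the gap between a value and the next representable double.

```python
import math

value = 76.4935
step_degrees = math.ulp(value)      # spacing of doubles near 76.4935
print(step_degrees == 2 ** -46)     # True: 46 bits after the binary point
print(2 ** 46)                      # 70368744177664

metres_per_degree = 40_075_020 / 360
step_mm = step_degrees * metres_per_degree * 1000
print(f"{step_mm:.7f} mm")          # 0.0000016 mm
```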
This example has shown how knowing a little of how floating point works can help you be confident about issues of precision, when choosing types to represent approximate values, or measurements – rather than automatically falling back on DOUBLE or even DECIMAL as a “paranoid default”. The key is to know your data, and understand how much precision you have, and how much your application needs.
The Future of MySQL (EU Crunch Time)
You’ve probably seen Monty’s post Help Saving MySQL. This is about
- Development (will Oracle put significant effort into MySQL, actually innovating)
- Brand (“MySQL” has a huge footprint), the trademark owner can enforce this – there have already been issues with companies offering MySQL related services via Google AdWords not being able to use the word MySQL in their ad text even though it was correctly used as an adjective.
- Forking is fine, but still has to deal with the branding. For MySQL, that’s possibly the most significant issue of any OSS product ever encountered. You’re not competing against a company, but against an existing brand footprint that you (because of the trademark) have to steer clear of. So “just fork it” is not an easy or short term option, there’s more involved than technical/development work.
- Code IP – to some degree (IMHO less important), it’s the thing that enables dual licensing. I regard dual licensing as a pest that’s best got rid of.
The really important thing to realise is that this is not about “killing Sun to save MySQL”, or “sending the right message to investors”. The former is merely a consequence of Oracle’s unwillingness to discuss any other option (whether rightfully or not, that’s just a fact) and the latter has no direct bearing on what’s right for either MySQL or Oracle – it’s definitely a factor that the investor world may consider, but it wouldn’t be a consideration for the EU.
With all that noted… please look at Monty’s post, he provides options and links for you to act on whichever way forward you feel is appropriate, whether for or against or neutral towards Oracle being able to take over Sun with MySQL in unmodified fashion. I think it’s good for more users (essentially interested parties) to express their opinion, since Oracle has managed to mobilise its own customers to flood the EU with their angle. While valid, the result ends up being a tad one-sided!
As I wrote in my comment/update on the Possible Movement in the Oracle/Sun/MySQL/EU Case, it’s unfortunate that the rumour suggesting that Oracle was willing to have MySQL as a separate business entity turned out to not be true, as I reckon it would have been a useful outcome for both Oracle and MySQL. A company can’t/won’t disrupt itself, and there are serious business-related “conflicts” to deal with if a single company sells both products. Corporate structures and sales will always make decisions to steer away from competing with themselves, and generally choose the most profitable road. Which one of the two that is in this case is not relevant, my take is that in the market both Oracle and MySQL have their place, so having either one lose out would not be good.
Irrespective of good intentions, companies do abide by certain rules – well actually many companies are ignorant of them and waste tons of money essentially trying to defy gravity. In any case, for me the issue is not with Oracle having good intentions or mistrusting that, the issue is that not even Oracle can defy gravity. The effort will go where the money is.
Remember what I quoted long ago about IBM and the PC (from Clayton Christensen’s The Innovator’s Dilemma)? IBM planted the new department in another state with its own management and finances, because they knew that in the corporate/management decisions, inevitably the existing mainframe business would win and thus prevent any cannibalisation (from within) of its position. In a nutshell, a company can’t disrupt itself. It’s well documented. I think that overall, the Oracle/Sun deal is a good match. But also, I think MySQL needs to be handled properly to make sure that both MySQL and Oracle (the db product) will thrive in the future. I feel that’s what’s important.
Possible movement in the Oracle/Sun/MySQL/EU case
From NY Post: Oracle Leader Blinks – Larry’s Olive Branch (to the EU), the NYpost sources apparently say that “what [...] Ellison is proposing is the creation of a firewall between MySQL and the rest of the combined company, and possibly setting up an entirely separate board for the MySQL business.”
There is no independent confirmation of any of this, so it may be true, or just air, or a trial balloon to see how other parties respond… I’m not going to add opinions to this, I just reckon it’s an interesting progression in the case. We’ll see how it pans out.
Update: so it’s not true (see Reuters).
(so now I’ll add my opinion…) Unfortunate in a way because from my perspective it would have actually been a useful outcome for both Oracle and MySQL. A company can’t/won’t disrupt itself, and there are serious business-related “conflicts” to deal with if a single company sells both. Corporate structures will always make decisions to steer away from competing with themselves, and go for the most profitable road. Which one of the two that is in this case is not relevant, the point is that in the market both Oracle and MySQL have their place, so having either one lose out would not be good.
On SQL vs No-SQL
The No-SQL tag really lumps together a lot of concepts that are in fact as distinct from each other as they are from SQL/RDBMS.
An object store is not at all similar to Cassandra and Hypertable, which are not at all like a column store. And when looking at BigTable derivatives, it’s quite important to realise that Google actually does joins in middle layers or apps, so while BigTable does not have joins, the apps essentially do use them – I’ve heard it professed that denormalising everything might be a fab idea, but I don’t quite believe in that for all cases, just like I don’t believe in ditching the structured form of RDBMS being the solution.
SQL/RDBMS has had a few decades of dominance now, and has thus become the great “general purpose” tool. With the ascent of all the other tools, it’s definitely worthwhile to look at them, but also realise that each (including SQL based ones) has its place. Moving all your stuff wholesale from one to the other is probably a fail.
At the recent OpenSQL Camp in Portland, Brian Aker did a short (7 minute) talk, covering some of these aspects, with a humorous angle. It’s educational, and fun!
OQGRAPH at OpenSQL Camp 2009, Portland
Antony is travelling up to Portland for this great event that’s about to start Fri evening and going over the weekend. He’ll be showing other devs and people more about the OQGRAPH engine, and gathering useful feedback.
Open Query is, together with many others (I see Giuseppe, Facebook, Gear6, Google, Infobright, Jeremy Cole, PrimeBase Technologies, Percona, Monty Program, and lots more), sponsoring the event so that it’s accessible for everybody – the key factor becomes getting there, rather than having to worry about high conference fees.
Having acquired the world’s biggest jetlag flying to Charlottesville VA for last year’s OpenSQL Camp, I can confirm from personal experience that it’s a great event. While I can’t be there this time, I’m looking forward to hearing all about it!
OQGRAPH update: speed, maze example, 5.0 packages
Antony has done a bit of magic, considerably speeding up inserts. Since the base implementation does not have persistence, insert speed is particularly important. Copying the 2×89,051 edges for the Tree-of-Life example is now near-instant.
The delete bug has been fixed.
There’s a new Maze example in the OQGRAPH trunk on Launchpad, first introduced in my MySQL University session. I created/inserted a maze of 1 million rooms (that comes to about 3 million edges), and OQGRAPH found the shortest path (122330 steps for this particular maze) in about one second. That’s pretty good, I think!
Last but not least, the OurDelta builds of MySQL 5.0.87-d10 have been published (for all Debian, Ubuntu, CentOS/RHEL, generic) and the -Sail edition of the packages have OQGRAPH built-in. So if you use 5.0 or just want to play, it’s now very easy to get started!
Earlier in the week we received a message from an early OQGRAPH adopter, telling how he’s using it to manage paths in his IP network: calculations that would previously require many minutes are now completed in a fraction of a second and a single query. He admitted to being pretty much an all Oracle shop, with this OQGRAPH app being his first exploration of MySQL space. He loves OQGRAPH, and I suppose that by proxy implies he likes MySQL too.
OQGRAPH session on MySQL University – recording now available
It was fun doing the MySQL University session on OQGRAPH yesterday. Now also available: slides (PDF) and audio/video recording (FLV download, if anyone can convert to a more open format, that’d be great).
The search for MySQL 5.5
So, MySQL 6.0 was ditched, and a few weeks ago 5.4 was also – its features to be added in other (earlier) versions (I’m told 5.2 but not sure). I reckon that’s good news, regardless of the version number. There was also an announcement about a change in the release mechanism at Sun/MySQL.
Now for practicals. If I look on Launchpad, the 5.1 branch is the only active one (next to 5.0 fixes, of course). 5.4 was last updated 15 weeks ago. There is no 5.2 on there that I can find. Wasn’t looking for it really, just happened to notice its absence while I was trying to find 5.5. And the reason for that was that Miguel closed a bug I was following, noting it was no longer reproducible in 5.5. He pastes some code that reports mysql as 5.5, so it’s not a typo.
So, in addition to the above list of abandonment (5.4, 6.0), we have 5.2 which I’m told should exist but doesn’t at Launchpad, and 5.5 which appears to exist and is news to me yet doesn’t appear to be out there either. Are you confused? I am.
The particular bug was found during a training session and occurs on Windows. Now the bug is closed, but we can’t see code and have no indication when it or binaries will be available. So what do I tell a user asking about the bug and its apparent fix? (I have to say apparent because Miguel’s response indicates that it’s merely not reproducible on the later version; there’s no specific fix.)
Updates
- Vladislav Vainroub notes there’s a mysql-next-mr branch on Launchpad which is in fact version 5.5 inside. It appears to be mirrored, last sync 5 minutes ago but last changeset 39 hours ago. So this seems like a publishing branch, not a development branch (otherwise we’d see more activity).
- Paul DuBois tells that the mysql-server trunk on Launchpad is now 5.5. Last activity is from a week ago, so I presume that like the abovementioned mysql-next-mr branch it’s synced and not actually from a live development branch. Pity.
OQGRAPH engine on MySQL University – 5 Nov 2009 10:00 UTC
Only a few weeks after Walter’s session on Multi-Master Replication with MMM and thanks to the great gang at MySQL Docs (my colleagues from long ago!) I’ll be doing a MySQL University session in a few days, about the GRAPH computation engine. From talks/demos I’ve done about it so far, I’ve learnt that people love it but there are lots of interesting questions. After all, it’s a pretty new and in a way exotic thing.
MySQL University uses DimDim, an online presentation service. You’ll see slides, and hear my voice. You can also type questions in a live chat room. We actually even got desktop sharing working so a live demo is possible, we’ll see how that goes on the day (I’ll make sure to have static slides for the same also).
For session details and the exact link to DimDim, see the MySQL uni page for the OQGRAPH session.
To attend, please calculate the starting time for your local timezone! It’ll be very early in the morning for US people, however for Europe it will be late morning, and Asia/Pacific will be evening. If you miss the live session, there’ll be a recording online soon afterwards and of course you can contact me for questions anyway. Still, it would be cool if lots of people attended live, that’s always extra useful. Hope to meet you there!
MariaDB 5.1 packages for Debian/Ubuntu
See the OurDelta blog for details of this release. RHEL/CentOS packages also coming.
thread_stack_size in my.cnf
Many configs have thread_stack_size configured explicitly, but that can cause rather bad trouble:
- if the stack inside a thread is too small, you can get segfault crashes (stack overflow, essentially). Particularly on 64-bit.
- if the stack is too large, your system cannot handle as many connections since it all eats RAM.
Let mysqld sort it out, on startup it does a calculation based on the CPU architecture, and that’s actually the most sensible. So for almost all setups, remove any thread_stack_size=… line you might have in my.cnf.
Trivia: identify this replication failure
We got good responses to the “identify this query profile” question. Indeed it indicates an SQL injection attack. Obviously a code problem, but you must also think about “what can we do right now to stop this”. See the responses and my last note on it below the original post.
Got a new one for you!
You find a system with broken replication; it could be a slave or one in a dual master setup. The IO thread is still running, but the SQL thread is not, and the last error is (yes, the error string is exactly this, very long – sorry I did not paste this string into the original post – updated later):
“Could not parse relay log event entry. The possible reasons are: the master’s binary log is corrupted (you can check this by running ‘mysqlbinlog’ on the binary log), the slave’s relay log is corrupted (you can check this by running ‘mysqlbinlog’ on the relay log), a network problem, or a bug in the master’s or slave’s MySQL code. If you want to check the master’s binary log or slave’s relay log, you will be able to know their names by issuing ‘SHOW SLAVE STATUS’ on this slave.”
In other similar cases the error message is about something else but the query it shows with it makes no sense. To me, that essentially says the same as the above.
The server appears to have been restarted recently.
What’s wrong, and what’s your quickest way to get replication going again given this state?
Trivia: identify this query profile
You do SHOW PROCESSLIST, and you see one of your web apps issue the following query:
SELECT ... WHERE ... AND 1=2 UNION SELECT ...
What does this tell you, and what do you do next?
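For readers who want to see how a query of that shape can come about, here is a small self-contained Python sketch. It uses SQLite from the standard library rather than MySQL so that it runs anywhere, and the `products` and `users` tables are entirely hypothetical; it illustrates the general pattern, not the actual application from this post.

```python
import sqlite3

# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER, name TEXT);
    CREATE TABLE users (login TEXT, password TEXT);
    INSERT INTO products VALUES (1, 'widget');
    INSERT INTO users VALUES ('admin', 's3cret');
""")

# What an attacker might submit in place of a numeric product id.
user_input = "1 AND 1=2 UNION SELECT login, password FROM users"

# Vulnerable: the input is pasted straight into the SQL text.
# "AND 1=2" empties the intended result; the UNION supplies the attacker's rows.
unsafe_sql = "SELECT id, name FROM products WHERE id = " + user_input
leaked = conn.execute(unsafe_sql).fetchall()
print(leaked)      # [('admin', 's3cret')]

# Safer: a bound parameter is treated as one value, never as SQL.
safe_rows = conn.execute(
    "SELECT id, name FROM products WHERE id = ?", (user_input,)
).fetchall()
print(safe_rows)   # []
```

The always-false `1=2` condition is the giveaway in the process list: no legitimate application code needs to ask for nothing and then UNION something else onto it.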
Mmm, what an interesting week
I have been very busy here in Malaysia this week. On Thursday, I was asked to do a MySQL University session on MMM. The preparation was very stressful. There was no good wifi to be found until literally a few hours before the session (Big thank you to Gurdip at APIIT for providing a space and exceptional help!). On top of that, DimDim, the software used by MySQL for their sessions, doesn’t seem to want to work on Linux (particularly the speaker part). I ended up using a laptop borrowed from APIIT with Vista and IE. Feels kind of counter-intuitive for a company in the FOSS business.
The session went very well and here is the resulting recording of the MMM talk on the mysqlforge page.
But that wasn’t the end of the MMM-promotion week: I happened to be allowed to present at the foss.my conference in Kuala Lumpur pretty last minute. At first I was going to do an updated version of the talk I gave at Froscon in August, but I was asked to do a tutorial session of 3 hours. I had never done anything like that, but I am always up for a challenge.
Again, preparation was a bit stressful. I didn’t know how many people to expect and it wasn’t clear if I would achieve getting running MMM clusters up in 3 hours. Well, I was underestimating my own capabilities apparently. Almost 100 people showed up, most of them without a laptop. I was surprised at that and explained to them that it was probably not going to be so interesting for them. Again, I was wrong. While the laptop-owners prepared their laptops, I used my time to explain to everyone what MMM is, and how it works. Then we set up the laptops, working through all the problems we met on the beamer, to which we had one user’s laptop connected.
In the end we managed to set up 2 clusters within exactly 3 hours. Only 6 (almost 7) ‘servers’ participated in that end-result, for various reasons the rest was not possible. Still, it was a very good result and the attendees were visibly very happy.
If you hadn’t noticed yet, I’m a big fanboy for MMM and think this project needs/deserves a lot more visibility. It really solves a bunch of problems many MySQL admins struggle with, while providing a simple, cheap HA solution. This week has been very good for the promotion of MMM.
Along the way I also discovered that I really love doing this workshop and I hope to do many more like this. On that note: if you know of any conferences or meetings in the Asia Pacific area in the upcoming months, let me know and I’ll try to be there with either a presentation or a workshop!
OQGRAPH on Launchpad, graph examples
The MySQL 5.0 and MySQL/MariaDB 5.1 source code is now also available through Launchpad. If you were waiting for a version for 5.1 and are ok with building the plugin from source, now you can!
The repo contains a subdir for examples, we’re hoping many people will contribute little snippets and scripts to import and use interesting datasets. To give you a hint, with graph capabilities you are able to deal with RDF data sources. You just need to transform the XML to say CSV, import into a suitable structure, and copy the edge information across to an OQGRAPH table.
Roland Bouman’s tree-of-life (which uses xslt stylesheets) is a good example of that approach, and was the first entry in the examples tree, including an SQL dump of the base dataset (it was CC-NC licensed) so you don’t necessarily have to fuss with the RDF/xslt foo.
Enjoy! We want to have examples/demos, a proper testsuite (there’s a bug/wishlist for that), and more. If you can help, please do: mucking around with graphs is great fun. If you implement OQGRAPH in a “proper” app, we’d also like to hear from you. The examples are intended to get people used to what OQGRAPH can do, and thus trigger ideas for practical uses. It’s not just fun. With OQGRAPH’s capabilities and speed, you can profit.
GRAPH Engine Linux binaries in MySQL 5.0.86-d10 available now
At this point we have a 32-bit and a 64-bit Linux binary tarball, should work on most Ubuntu and CentOS and the like (I tested a few). Possibly OSX coming. Not sure on Windows right now.
For further details and download links, see yesterday’s release post.
GRAPH Engine source in MySQL 5.0.86-d10 available now
It’s time to play! A special thanks particularly to Antony Curtis for the excellent smart and actually very speedy coding, and for just being a great guy to work with. If you would like to utilise his ace MySQL knowledge and coding skills, do talk to me!
Right now, we have a source tarball available for you, patching OQGRAPH on top of a MySQL 5.0.86-d9-Sail (OurDelta) source. As you know MySQL 5.0 does not have engine plugins so patching is the only way we can put it in. This OQGRAPH codebase is licensed under GPLv2+.
Even though we’ve successfully built it on several platforms and architectures, since this is the first public release we’d like you to try it first, as we’re sure that there might be problems on some platforms. When we catch and fix those, we can do proper package builds.
You will find the link to the source tarball, and other necessary instructions and configuration, on the documentation page. It’s tempting to skim through it and just start playing, but I recommend you really read through first: this engine is quite different. Please explore, and tell us what you think!
- http://openquery.com/graph/doc – OQGRAPH preliminary documentation
- http://openquery.com/forum/oqgraph – public forum
- https://bugs.launchpad.net/oqgraph – bug tracking, feature requests
- http://openquery.com/graph – product information, support and engineering options
To contact Open Query directly about the GRAPH engine, email g r a p h (at) o p e n q u e r y (dot) c o m


