[SOLVED] XML or SQLite: When to drop XML for a database?

Issue

I really like XML for saving data, but when does SQLite (or a database in general) become the better option? E.g., when the XML has more than x items or is larger than y MB?

I am coding an RSS reader and I believe I made the wrong choice in using XML over a SQLite database to store a cache of all the feeds' items. Some feeds have an XML file of ~1 MB after a month, another has over 700 items, while most have only ~30 items and are ~50 KB in size after several months.

I currently have no plans to implement a cap because I like to be able to search through everything.

So, my questions are:

  1. When is the overhead of SQLite/databases justified over using XML?
  2. Are the few large XML files justification enough for the database when there are a lot of small ones, even though the small ones will also grow over time (a long, long time)?

updated (more info)

Every time a feed is selected in the GUI, I reload all the items from that feed's XML file.

I also need to modify the read/unread status, which seems really hacky: I loop through all the nodes in the XML to find the item and then set it to read/unread.

Solution

I basically agree with Mitchel that this is highly specific to what you are going to do with XML and SQLite. For your case (a cache), it seems to me that using SQLite (or another embedded database) makes more sense.

First, I don’t really think that SQLite needs more overhead than XML, and I mean both development-time overhead and runtime overhead. The only problem is that you take on a dependency on the SQLite library. But since you would need some library for XML anyway, it doesn’t matter (I assume the project is in C/C++).

Advantages of SQLite over XML:

  • everything in one file,
  • performance degrades more slowly than with XML as the cache gets bigger,
  • you can keep feed metadata separate from the cache itself (in another table), but accessible in the same way,
  • SQL is probably easier to work with than XPath for most people.
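
To illustrate the separate-table point and the read/unread update from the question, here is a minimal sketch using Python's built-in sqlite3 module (Python only for brevity; the same SQL works through the C API). The table and column names (feeds, items, guid, is_read) and the helper functions are assumptions made for illustration, not anything from the asker's project.

```python
import sqlite3

def open_cache(path=":memory:"):
    """Open (or create) the cache; schema names are hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS feeds (
            id    INTEGER PRIMARY KEY,
            url   TEXT NOT NULL UNIQUE,
            title TEXT
        );
        CREATE TABLE IF NOT EXISTS items (
            id      INTEGER PRIMARY KEY,
            feed_id INTEGER NOT NULL REFERENCES feeds(id),
            guid    TEXT NOT NULL,
            title   TEXT,
            is_read INTEGER NOT NULL DEFAULT 0,
            UNIQUE (feed_id, guid)  -- also gives SQLite an index for lookups
        );
    """)
    return conn

def mark_read(conn, feed_id, guid, is_read=True):
    """Flip the read flag with one statement instead of walking every node."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE items SET is_read = ? WHERE feed_id = ? AND guid = ?",
            (int(is_read), feed_id, guid),
        )
    return cur.rowcount  # number of rows changed

def load_items(conn, feed_id):
    """Fetch only the selected feed's items, in insertion order."""
    return conn.execute(
        "SELECT guid, title, is_read FROM items WHERE feed_id = ? ORDER BY id",
        (feed_id,),
    ).fetchall()
```

With something like this, selecting a feed in the GUI becomes a single load_items call, and toggling an item is a single mark_read call.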

Disadvantages of SQLite:

  • it can be problematic when multiple processes access the same database (probably not your case),
  • you need to know at least basic SQL. Unless there will be hundreds of thousands of items in the cache, I don’t think you will need to optimize it much,
  • it may be somewhat more dangerous from a security standpoint (SQL injection). On the other hand, you are not coding a web app, so this should not happen.

Other things are probably on par for both solutions.
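
On the injection point: feed titles do come from remote servers, so binding values as parameters is a cheap precaution even in a desktop app. A small sketch with Python's sqlite3 module follows; the items table and the add_item helper are hypothetical names used only for this example.

```python
import sqlite3

def add_item(conn, guid, title):
    # The ? placeholders bind the values as data, so quotes or SQL
    # keywords inside a feed title are never executed as SQL.
    with conn:
        conn.execute(
            "INSERT INTO items (guid, title) VALUES (?, ?)",
            (guid, title),
        )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (guid TEXT PRIMARY KEY, title TEXT)")

# A title that would cause trouble if pasted into the SQL string directly.
hostile_title = "x'); DROP TABLE items; --"
add_item(conn, "g1", hostile_title)

stored = conn.execute(
    "SELECT title FROM items WHERE guid = ?", ("g1",)
).fetchone()[0]
```

The title is stored verbatim and the table survives. The risk only appears when values are concatenated or formatted into the statement text.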

To sum up, the answers to your questions, respectively:

  1. You will not know unless you test your specific application with both back ends. Otherwise it’s always just a guess. Basic support for both caches should not be a problem to code. Then benchmark and compare.

  2. Because of the way XML files are organized, SQLite searches should always be faster (barring some corner cases where it doesn’t matter anyway because it’s blazingly fast). Speeding up searches in XML would require an index database anyway; in your case that would mean having a cache for the cache, which is not a particularly good idea. With SQLite you get indexing as part of the database.
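
A rough way to try the "benchmark and compare" advice is sketched below: it builds the same synthetic items as an XML document and as an in-memory SQLite table, then times finding one item by its id. The item count, the element and column names, and the lookup helpers are all assumptions for illustration. Real numbers will depend on your files, your XML library and your disk, so treat this as a starting point rather than a result.

```python
import sqlite3
import timeit
import xml.etree.ElementTree as ET

N = 2000  # hypothetical number of cached items

def build_xml(n):
    root = ET.Element("feed")
    for i in range(n):
        item = ET.SubElement(root, "item", guid=f"g{i}", read="0")
        item.text = f"Title {i}"
    return ET.tostring(root)  # bytes, standing in for the file on disk

def build_db(n):
    conn = sqlite3.connect(":memory:")
    # PRIMARY KEY on guid means SQLite maintains an index for it.
    conn.execute("CREATE TABLE items (guid TEXT PRIMARY KEY, title TEXT)")
    with conn:
        conn.executemany(
            "INSERT INTO items (guid, title) VALUES (?, ?)",
            ((f"g{i}", f"Title {i}") for i in range(n)),
        )
    return conn

def xml_lookup(xml_bytes, guid):
    root = ET.fromstring(xml_bytes)  # reparse, as when reloading a feed
    for item in root.iter("item"):   # linear scan over every node
        if item.get("guid") == guid:
            return item.text
    return None

def db_lookup(conn, guid):
    row = conn.execute(
        "SELECT title FROM items WHERE guid = ?", (guid,)
    ).fetchone()
    return row[0] if row else None

xml_bytes = build_xml(N)
conn = build_db(N)
target = f"g{N - 1}"  # last item: worst case for the linear scan

xml_time = timeit.timeit(lambda: xml_lookup(xml_bytes, target), number=20)
db_time = timeit.timeit(lambda: db_lookup(conn, target), number=20)
print(f"XML scan: {xml_time:.4f}s  SQLite lookup: {db_time:.4f}s")
```

Note that an ordinary index like this helps exact lookups. Free-text search over titles is a different problem and would need its own measurement.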

Answered By – Stan

Answer Checked By – Mildred Charles (BugsFixing Admin)
