test lab

Log-dedicated loop device throughput and time overhead on btrfs 4.x

This is again about real-world numbers (which I like so much for being authentic ;-). The context is the throughput and time overhead of a loop device, used as a poor (or late) man’s replacement for a real disk partition (yep, I know), sitting on the copy-on-write filesystem btrfs and dedicated exclusively to logging, that is, appending over and over. Why? The loop device may overflow with data without affecting the underlying filesystem; see Btrfs subvolume quota still in its infancy with btrfs version 4.2.2 for more of the whys and for what I tried in order to get btrfs subvolumes with quotas to work. By the way, on that topic, see debian org Btrfs for a down-to-earth assessment of btrfs to date, which even recommends a minimum version (4.4) for anything near production. Anyway, what follows adapts the test setup from Performance of loopback filesystems (prime credit goes there) and expands the layout somewhat to cover the btrfs C, or nodatacow, flag. Here we go.
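
For orientation, a loop-backed log filesystem with the C flag could be set up roughly as sketched below. This is a minimal sketch under stated assumptions, not the exact setup measured here: the helper name make_log_loop, the paths, the size and the choice of ext4 inside the image are all made up for illustration. On btrfs the C attribute only takes effect when it is set while the file is still empty, hence the order of touch, chattr and truncate.

```shell
# Dry run by default: RUN=echo prints each command.
# Set RUN to the empty string (as root) to actually execute them.
RUN=${RUN-echo}

# make_log_loop IMAGE SIZE MOUNTPOINT  (hypothetical helper, illustrative names)
make_log_loop() {
  img=$1; size=$2; mnt=$3
  $RUN touch "$img"                 # create the image file while it is still empty
  $RUN chattr +C "$img"             # nodatacow; only effective on an empty file
  $RUN truncate -s "$size" "$img"   # grow the (sparse) image to its final size
  $RUN mkfs.ext4 -q -F "$img"       # assumption: ext4 inside the loop image
  $RUN mkdir -p "$mnt"
  $RUN mount -o loop "$img" "$mnt"  # mount picks a free loop device itself
}
```

Calling make_log_loop /var/tmp/log.img 2G /mnt/log prints the six commands for review before anything touches the disk.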

Take this as the baseline, if you like: a test on the raw iron. Well, it’s not really raw iron, it’s VMware with tons of storage below; the performance shown is terrible, I know, and I only use one test set of bs / count, but that won’t matter. It’s something to start off with.

mkdir /tmp/loop0
dd if=/dev/zero bs=1M of=/tmp/loop0/file oflag=sync count=1000
1048576000 bytes (1.0 GB) copied, 8.9605 s, 117 MB/s
1048576000 bytes (1.0 GB) copied, 6.52867 s, 161 MB/s
1048576000 bytes (1.0 GB) copied, 5.35716 s, 196 MB/s
1048576000 bytes (1.0 GB) copied, 5.48745 s, 191 MB/s
1048576000 bytes (1.0 GB) copied, 5.14736 s, 204 MB/s
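
To condense several runs like these into one figure, the dd summary lines can be averaged. The following is a minimal sketch; the helper name dd_mean_mbs is hypothetical, and it assumes the summary line format shown above (byte count first, elapsed seconds right before the "s," token), with MB meaning 10^6 bytes as dd reports it.

```shell
# dd_mean_mbs: read dd summary lines on stdin, print the mean throughput in MB/s.
# Throughput is recomputed per run from bytes and seconds, then averaged.
dd_mean_mbs() {
  awk '/ bytes .* copied, / {
         bytes = $1
         secs = 0
         # the elapsed time is the field right before the "s," token
         for (i = 2; i <= NF; i++) if ($i == "s,") secs = $(i - 1)
         if (secs > 0) { sum += bytes / secs / 1000000; n++ }
       }
       END { if (n) printf "%.1f\n", sum / n }'
}
```

Since dd writes its summary to stderr, a run would be fed in along the lines of dd ... 2>&1 | dd_mean_mbs, or from a file collecting the summaries of several runs.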


Btrfs subvolume quota still in its infancy with btrfs version 4.2.2

Ever tried to dive into the btrfs subvolume topic, especially in combination with quotas (not snapshots here)? It looks really promising: administering subvolumes in hierarchies automagically offers quota management on (summed-up) top levels and on (dedicated) sub-levels by design; see Btrfs SysadminGuide Subvolumes or Btrfs: Subvolumes and snapshots, for example. The later 4.x kernels come with btrfs 4.2.2, which represents a huge step forward in btrfs development, so I thought I would give it another try on a Red Hat / Oracle UEK based 7.2 system.

In the following, I’m going to show what I attempted to achieve, the how-tos, the workarounds I tried and, intermixed, the quite odd behaviour that I observed. Odd to such a degree that I recommend everyone stay away from employing this promising but still half-finished (?) technology.

Setting up a subvolume dedicated to quota control is quite easy and takes only a few keystrokes. Understand, though, that quota control with btrfs can only be enabled for the entire filesystem; the limit itself can then be set per subvolume.
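
As a rough illustration of those few keystrokes, the sequence might look like the sketch below. The helper name setup_quota_subvol, the mount point, the subvolume name and the limit are assumptions for illustration only; the three btrfs subcommands (quota enable, subvolume create, qgroup limit) are the standard btrfs-progs ones.

```shell
# Dry run by default: RUN=echo prints each command.
# Set RUN to the empty string (as root) to actually execute them.
RUN=${RUN-echo}

# setup_quota_subvol MOUNTPOINT NAME LIMIT  (hypothetical helper)
setup_quota_subvol() {
  mnt=$1; name=$2; limit=$3
  $RUN btrfs quota enable "$mnt"                 # quota is switched on filesystem-wide
  $RUN btrfs subvolume create "$mnt/$name"       # the subvolume to be limited
  $RUN btrfs qgroup limit "$limit" "$mnt/$name"  # the limit applies to this subvolume only
}
```

For example, setup_quota_subvol /mnt/data logs 1G prints the three commands that would cap a logs subvolume at 1G.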


(raw) Oracle Linux 6 memory footprint with/out X11

Running quite a couple of (Linux) guests on a virtual host sooner or later raises the question of the (guests / host resources) ratio. That is, commonly for cpu, ram and i/o, how many guests will fit on that specific host for an average load.

Having this question nagging in my head, I was particularly curious to find out how much memory an OL6 will consume for a pure operating system installation with and without the convenience of running X11 (which is animated by Gnome 3.x in OL6, having most of the autostart apps removed – xfce will be much leaner but that’s another story).

The top Mem: used snapshots were taken immediately after a bounce of the guest, to have as little interference from application code as possible, since Linux never frees memory unless necessary. Here we go.
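
For a scripted variant of such a snapshot, the same kind of figure can be derived from /proc/meminfo. This is a minimal sketch: the helper name mem_used_mib is hypothetical, and it assumes that "used" means MemTotal minus MemFree (so buffers and cache count as used), which is my understanding of how the older top versions report it.

```shell
# mem_used_mib: read /proc/meminfo-style input on stdin, print used memory in MiB.
# Assumption: used = MemTotal - MemFree; both values are given in kB.
mem_used_mib() {
  awk '/^MemTotal:/ { total = $2 }
       /^MemFree:/  { free = $2 }
       END { printf "%d\n", (total - free) / 1024 }'
}
```

Right after a reboot it would be called as mem_used_mib < /proc/meminfo.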


Resource allocation when importing a yago2 ontology into an oracle semantic database

This is a short review of the resources necessary when importing a full yago2 ontology, which is about 200 million triples, into an oracle semantic database. Some information and snippets on how to execute the import are given, but this is not the main focus of the article. I’m still on a fresh, not otherwise loaded 11gR2.0.1 database.

The import mainly followed the instructions given with the rdf demos, that is, using sqlldr and bulkload.ctl to populate a staging table as proposed in ORACLE_HOME\md\demo\network\rdf_demos and afterwards employing sem_apis.bulk_load_from_staging_table() to actually load the data into the semantic net. bulkload.ctl has in fact not been changed at all, the yago2 data being supplied in N-Triples (nt) format like this:

<Embeth_Davidtz> <http://yago-knowledge.org/resource/actedIn> <Army_of_Darkness> .

and the staging table:

create table yago2_stage (
  RDF$STC_sub varchar2(4000) not null,
  RDF$STC_pred varchar2(4000) not null,
  RDF$STC_obj varchar2(4000) not null,
  RDF$STC_sub_ext varchar2(64),
  RDF$STC_pred_ext varchar2(64),
  RDF$STC_obj_ext varchar2(64),
  RDF$STC_canon_ext varchar2(64)
) compress;

and the sql loader call:

sqlldr userid=lucene/*** control=yago2.ctl data=yago2_1.nt direct=true skip=0 load=95000000 discardmax=10000 bad=yago2.bad discard=yago2.dis log=yago2.log errors=0


Installing DBPrism (in-Oracle-Lucene) on a WinXP-10gR2


This post shares experiences from installing Marcelo Ochoa’s (MO, Marcelo Ochoa’s personal blog) in-oracle-lucene implementation into an oracle on winxp. MO did a really great job of porting a raw lucene to an oracle database. His essential trick is to represent the file system layer that lucene usually lives in as blob-based storage, and to employ the oracle odci interface. Further information is available in the online documentation, here: Lucene Domain Index.


The latest version of the code stack is available as lucene-odi-bin- from http://sourceforge.net/projects/dbprism/files/odi/ Do not worry about the sourceforge project being called DBPrism, or about the other code stack being tagged ojvm; this is all history. The latest version, comprising the lucene 3.0.2 core base, is lucene-odi; lucene-ojvm currently features the lucene 2.9.2 core base as just a maintenance release.

Another necessary download is ant, as the build and installation runtime environment. I sourced apache-ant-1.8.1-bin.zip from http://ant.apache.org/bindownload.cgi. Finally, if you plan to compile the java code to windows dlls for better performance (especially on production hosts), you may need to get Microsoft Visual C++ 2010 Express from http://www.microsoft.com/express/Downloads. It’s free and supplies, besides a lot of other stuff, what is needed here, i.e. a C++ compiler and dll linker.