Gurjeet [tech]: 2011

Replacing Google search on LinuxMint

I was sick of LinuxMint's hack on Firefox's search results page. And every time I fix it, the next update to Firefox replaces my fixed doc with the default one, which just sucks :(

So here's a script that I run after every update to Firefox:

cd /usr/share/linuxmint/common/artwork/firefox/
sudo wget http://mxr.mozilla.org/firefox/source/browser/locales/en-US/searchplugins/google.xml?raw=1 -O google.xml.fixed
sudo mv google.xml google.xml.orig
sudo mv google.xml.fixed google.xml

LVM: Reusing physical volumes from another machine

Here I wish to demo how to dismantle LVM setup on one machine and reassemble it on another without losing filesystem data stored on the Physical Volumes.

This may be useful in places where block devices can be unmounted from one machine and remounted on another, say to recover from hardware failure on the first machine. (I plan to use it on Amazon AWS' EBS volumes)

The key is to use the -'-zero n' option when reassembling Physical Volumes in lvcreate, and to use the same --size option as the done for the previous LVM setup.

A Volume Group stores its metadata in the Physical Volumes, and hence if you manage to reassemble all Physical Volumes of a Volume Group, you get the original Volume Group back. (Per responses of 'comps' on Freenode's #lvm channel, and confirmed by the testing below)

Now it is up to the Logical Volume creator to carve out volumes of the same size and from the same locations of this Volume Group.

# Legend:
# LVM = Logical Volume Manager 2
# PV = Physical Volume
# VG = Volume Group
# LV = Logical Volume
# FS = FileSystem

# Make a temp directory to play in
mkdir ~/lvm_test
cd ~/lvm_test

MINOR_NUMBER_START=100
MAX_DEVICES=8
for (( i = $MINOR_NUMBER_START ;
 i < $MINOR_NUMBER_START+$MAX_DEVICES ;
 ++i )); do
 # Make a few empty files
 dd if=/dev/zero of=dev$i bs=1024 count=20480

 # Make a few loop devices
 sudo mknod -m660 lo$i b 7 $i
 sudo losetup lo$i dev$i

 # Create LVM PVs using these loop devices
 sudo pvcreate lo$i
done

# Create a VG over all the PVs, and name it vg1
sudo vgcreate vg1 lo*

# Create a LV from the VG, and name it lv1
sudo lvcreate --name lv1 vg1 --size 100M

# Make EXT3 FS on LV with name lv1
sudo mkfs.ext3 `sudo lvdisplay -c vg1/lv1 | cut -d : -f 1`

# Make a directory to act as a mount point
mkdir dir

# Mount the LV on the directory
sudo mount `sudo lvdisplay -c vg1/lv1 | cut -d : -f 1` dir

# Create a 20 MB file filled with random data on this new FS
sudo dd if=/dev/urandom of=dir/dd.out bs=1024 count=20480

# Make a backup of the file
cp dir/dd.out ./dd.out


# Now we perform the test where we remove all the LVs and PVs from the VG and
# the VG itself, and then use the PVs to create a new VG and LV that can be
# mounted with the filesystem intact.

# Teardown the FS, LV and VG, but DONT remove the PVs
sudo umount dir
sudo lvchange --available n vg1/lv1
sudo lvremove vg1/lv1
sudo vgremove vg1

# Create a new VG using existing PVs, and name it vg_new
sudo vgcreate vg_new lo*

# Create a new LV on this VG, but DONT zero the first 1kB block; In essence,
# retain the data on Physical Volumes.
# Make sure to use the same size as previous LV (lv1), else mounting of EXT3 FS
# won't work.
sudo lvcreate --zero n --name lv_new vg_new --size 100M

# Mount the LV on the directory; Note that we did not format.mkfs on the LV.
sudo mount `sudo lvdisplay -c vg_new/lv_new | cut -d : -f 1` dir

# Compare the backed-up copy of del.dd with the one just mounted from the new
# Logical Volume
sudo diff del.dd dir/del.dd

# 
# Teardown
sudo umount dir
sudo lvchange --available n vg_new/lv_new
sudo lvremove vg_new/lv_new
sudo vgremove vg_new

for (( i = $MINOR_NUMBER_START ;
 i < $MINOR_NUMBER_START+$MAX_DEVICES ;
 ++i )); do
 sudo pvremove lo$i
 sudo losetup -d lo$i
done

rm -rf ~/lvm_test

unset MINOR_NUMBER_START
unset MAX_DEVICES

Real life use case of Postgres' LISTEN/NOTIFY

A Postgres noob asked me: what's the real time (sic) use of LISTEN and NOTIFY? And here's how the conversation went:

noob: what's the real time use of LISTEN and NOTIFY?

in postgresql

me: Well, that is a very useful facility.

Say you have two applications, one inserts some data and the other reads and acts on that data..

In a normal setup you would have the reading application querying the DB every few seconds to see if new data has come in..

This incurs overhead on the dataabase.

In Postgres you can use LISTEN/NOTIFY to achieve the same thing without having to 'poll' the database every few seconds.

The reading application can register its interest in new data by saying 'LISTEN newdata'

And the writing application, whenever it inserts new data, can send anotification, saying 'NOTIFY newdata'

At this point, Postgres will send a message to the reading application that 'newdata' has arrived, and now the reading application query the database to process newly inserted data.

I hope that explains it.

noob: thank you Gurjeet, that's really good

Happy Weekend

How to convert MTS files to AVI

Per my comments here on a blog post [1] I finally succeeded in converting MTS files (from my Sony HD handycam) to AVI files. Here's how to convert them en'masse:

I am using Linux Mint 10, and ffmepg version 0.6-4:0.6-2ubuntu6.1

.) Put all your MTS files in a directory, and cd into that directory.
.) Use ffmepg to find out the highest bitrate used by the MTS files:
    ls *.MTS | xargs -n 1 ffmpeg -i 2>&1 | grep bitrate | cut -d : -f 6 | sort -g | tail -n 1

    I got 13039 kb/s in my case.

.) Use ffmpeg to do the conversion:

    ls *.MTS | xargs -n 1 -I XXXX ffmpeg -i XXXX -b 13000k -ac 2 -ab 256k -deinterlace XXXX.AVI

    I used the 13000k value above because in the previous step we found out that the highest value is 13039 kb/s. This will allow us to keep the quality of our videos almost unaltered.

    I think that if the actual bitrate of a video is lower than this value, say 8000 kb/s, then ffmpeg will keep that bitrate instead of creating the AVI at 13000 kb/s bitrate, so the AVI filesizes will almost be the same as the MTS files.

    This step will result in additional files to be created in the directory where you are; for eg. if you had files a.MTS and b.MTS, this step will create files a.MTS.AVI and b.MTS.AVI.

[1] http://justplainobvious.blogspot.com/2010/06/how-to-convert-mts-to-avi-in-linux.html

Pagination of query results in Postgres

Marko's recent blog post on speeding up count(*) in Postgres sent me to his previous post where I saw a very valid reason from Darren Duncan for pagination of results from a database. The point being that web applications are usually expected to show the page links and allow the user to jump to different pages in the result.

One solution is to run the query twice, once with just a count(*) and then the actual query, but that'd be very inefficient.

A few experiments later I think I have a solution; a bit inefficient than running the plain query but much better than running the query twice.

with
REAL_QUERY as
 (select relname, oid from pg_class limit 5)
-- User should ignore the rest of the query
select *, show_rowcounter() as row_count
from (select REAL_QUERY.*, increment_rowcounter() as row_number
      from REAL_QUERY,
      (select reset_rowcounter()) as _c
      order by row_number
      offset 0) as _subq;

Don't get scared by the length of the query above, all one needs to do is put the real query between the first pair of parenthesis after the REAL_QUERY identifier.

The result would be the same as the original query, but with two additional columns: row_number and row_count. row_number numbers each row starting from 1, and row_count shows the total number of rows in the result.

.   relname    | oid  | row_number | row_count 
---------------+------+------------+-----------
 pg_statistic  | 2619 |          1 |         5
 pg_type       | 1247 |          2 |         5
 pg_attribute  | 1249 |          3 |         5
 pg_toast_1262 | 2844 |          4 |         5
 pg_toast_2604 | 2830 |          5 |         5
(5 rows)

The 'ORDER BY row_number' clause adds the overhead, but it is necessary so that the row_count is calculated before the first row is produced at the top level. I wish I could introduce a MATERIALIZE node in the query plan at will.

If your REAL_QUERY has an ORDER BY or GROUP BY clause then you can remove the ORDER BY row_number clause from the outer query, since Postgres will make sure that show_rowcounter() will not be called before the last call of increment_rowcounter().

The above trick uses the Common-Table-Expression feature (introduced in Postgres version 8.4), because I wanted to make it look like a template where user's real query is visible at the top.

If you are running on an older version you can easily modify it to be a simple query because CTE used above is not recursive.

select *, show_rowcounter() as row_count
from (select REAL_QUERY.*, increment_rowcounter() as row_number
      from (select relname, oid from pg_class limit 5) as REAL_QUERY,
      (select reset_rowcounter()) as _c
      order by row_number
      offset 0) as _subq;

Now the guts of the trickery: PL/Perl functions:

-- PL/Perl
create or replace function reset_rowcounter() returns int as $p$
    $_SHARED{rowcounter} = 0;
$p$ language plperl stable;

create or replace function increment_rowcounter() returns int as $p$
    $_SHARED{rowcounter} = $_SHARED{rowcounter} + 1;
    return $_SHARED{rowcounter};
$p$ language plperl;

create or replace function show_rowcounter() returns int as $p$
    return $_SHARED{rowcounter};
$p$ language plperl;

BTW, this also shows how one can get Oracle's ROWNUM like feature in Postgres.

Enabling right-click in GMail

For the record...

I recently switched to Linux Mint, and ever since then I right-clicks in GMail on my firefox had stopped working.

Disabling all the labs feature fixed the problem, so I started hunting for the specific offending labs feature.

In the end it turned out to be the 'Mouse gestures' feature. Since I don't use that feature at all, it was easy to just turn it off, and voila, my Firefox+GMail is back to normalcy.