2013-08-24

Real-time console based monitoring of PostgreSQL databases (pg_view)

In many cases, it is important to be able to keep your hand on the pulse of your database in real time. For example, when you are running a big migration task that can introduce unexpected locks, or when you are trying to understand how a long-running query is influencing your IO subsystem.

For a long time I was using a very simple bash alias, injected from the .bashrc script, that combined calls to system utilities like watch, iostat, uptime and df, some additional statistics from /proc/meminfo, and a psql call that extracted information about the currently running queries and whether those queries were waiting for a lock. But this approach had several disadvantages. In many cases I was interested in the disk read/write information for query processes or PostgreSQL system processes, like the WAL and archive writers. I also wanted a really easy way to notice the queries that are waiting for locks, ideally highlighting them with color.
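
The psql part of that alias was essentially a query against pg_stat_activity. A minimal sketch of such a query (not the exact one from my alias), assuming PostgreSQL 9.2 to 9.5 where the query, state and waiting columns are available, could look like:

SELECT pid,
       usename,
       state,
       waiting,                               -- true if the backend is blocked waiting for a lock
       now() - query_start AS query_runtime,  -- how long the current statement has been running
       query
  FROM pg_stat_activity
 WHERE state <> 'idle'
 ORDER BY query_start;
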
Several weeks ago we finally open-sourced our new tool that makes our lives much easier. It combines all the features that I had been dreaming of for a long time. Here it is: pg_view.

I actually already have some more feature requests and hope that Alexey will find some time to add them to the tool in the near future. So if somebody wants to contribute or has more ideas, please comment and open feature requests on the GitHub page :)

2012-01-22

Schema based versioning and deployment for PostgreSQL

I am one of the supporters of keeping as much business logic as possible in the database itself. This reduces the access layer of the application to mostly dumb transport and data transformation logic that can be implemented using different technologies and frameworks, without the need to re-implement critical data consistency and data distribution logic in several places. It gives an easy way to control what you are doing with your data and how your applications are allowed to access it, and even to exchange the underlying data structures transparently, without touching the upper levels of the code and without downtime. It also makes it possible to add an additional layer of security by allowing access to the data only through stored procedures, which can change their security execution context as needed (the SECURITY DEFINER feature of PostgreSQL).
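
As a minimal sketch of that last point (the table, the roles and the function here are hypothetical, not taken from a real project): a SECURITY DEFINER function owned by a privileged role can expose a controlled operation to clients that have no direct privileges on the underlying table.

CREATE TABLE accounts ( id bigint PRIMARY KEY, balance numeric NOT NULL );

-- executed with the privileges of the function owner, not of the caller
CREATE FUNCTION withdraw( p_account_id bigint, p_amount numeric )
RETURNS numeric AS
$$
  UPDATE accounts
     SET balance = balance - p_amount
   WHERE id = p_account_id
  RETURNING balance;
$$ LANGUAGE sql SECURITY DEFINER;

-- clients may only call the function, they cannot touch the table directly
REVOKE ALL ON accounts FROM public;
GRANT EXECUTE ON FUNCTION withdraw( bigint, numeric ) TO app_client;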

This approach has some disadvantages, of course. One of the biggest technical problems, which very easily becomes an organizational problem if you have a relatively big team of developers, is how to rapidly roll out new features without touching old, functioning stored procedures, so that old versions of your upper-level applications can still access the previous versions of the stored procedures, while newly rolled out nodes with the new software stack access new stored procedures that do something more, or less, or return different data sets compared to their previous versions. And of course, hundreds of stored procedures that access and manipulate the data are enough to turn any attempt to keep every new version of them backwards compatible into a nightmare.

The classical way to handle this would be to keep all the changes backwards compatible and, when that is not possible, to create a new version of a stored procedure with some version suffix like _v2, mark the previous version as deprecated, and drop it once the whole software stack has been rolled out to use the new function. But if you are rolling out a new version of the whole stack once or twice a week, keeping track of what is used and what is not becomes quite a challenge... and the discipline of all the developers has to be really good as well. Stored procedures are not the only objects that change: the return or input types can change together with them. Changing a return type that is used by more than two stored procedures in a backwards compatible fashion is pure horror, unless you create a new version of that type and new versions of all the stored procedures that use it. Dependency control becomes another problem.
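
To illustrate the classical suffix-based approach (get_object and its signature are hypothetical), a deprecation cycle typically looks like this:

-- the old get_object( bigint ) stays in place for the already deployed clients

-- the changed behaviour gets a version suffix
CREATE FUNCTION get_object_v2( p_id bigint )
RETURNS text AS
$$
  SELECT 'something more, or less, or different than get_object returned'::text;
$$ LANGUAGE sql;

COMMENT ON FUNCTION get_object( bigint ) IS 'DEPRECATED: use get_object_v2 instead';

-- only after every client has been switched over to get_object_v2:
DROP FUNCTION get_object( bigint );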

My solution to that problem was to introduce schema based versioning of PostgreSQL stored procedures. It is based on PostgreSQL schemas and the per-session search_path.

So all the stored procedures that are exposed to the client software stack are grouped into one API schema, which contains only those stored procedures and the types they need.

The schema name contains a version, like proj_api_vX_Y_Z, where X_Y_Z is the version that a software stack is targeted at. The software stack executes SET search_path TO proj_api_vX_Y_Z, public; immediately after it gets a connection from the pool, and all calls to the API stored procedures are made without explicitly specifying a schema name, so PostgreSQL resolves the needed stored procedure from that schema.
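
From the client's point of view a session then looks roughly like this (the concrete version number and the get_object call are hypothetical):

-- executed once per pooled connection, right after it is taken from the pool
SET search_path TO proj_api_v1_2_3, public;

-- resolved as proj_api_v1_2_3.get_object because of the search_path
SELECT * FROM get_object( 42 );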

So when a branch is stable and the branch version is fixed, that version becomes a property used to set the default search_path for the software being deployed from that branch. For example, in Java with the BoneCP JDBC pool, this means setting the initSQL property of all the pools used to access the proj database.

We are storing the sources of all the stored procedures (and other database objects) in a special database directory structure that is checked into a usual SCM system. All the files are sorted into corresponding folders and are prefixed with a two-digit numeric prefix to ensure the sorting order (good old BASIC times :) ). For example:
50_proj_api
  00_create_schema.sql
  20_types
    20_simple_object_input_type.sql
  30_stored_procedures
    20_get_object.sql
    20_set_object.sql

Here the 00_create_schema.sql file contains the CREATE SCHEMA proj_api; statement, statements that set the default security options for newly created stored procedures, and a SET search_path TO proj_api, public; statement, which ensures that all the objects coming after that file are injected into the correct API schema. An example 00_create_schema.sql file can look like this:
RESET role;

CREATE SCHEMA proj_api AUTHORIZATION proj_api_owner;

ALTER DEFAULT PRIVILEGES FOR ROLE postgres IN SCHEMA proj_api REVOKE EXECUTE ON FUNCTIONS FROM public;

GRANT USAGE ON SCHEMA proj_api TO proj_api_usage;

ALTER DEFAULT PRIVILEGES FOR ROLE postgres IN SCHEMA proj_api GRANT EXECUTE ON FUNCTIONS TO proj_api_executor;
ALTER DEFAULT PRIVILEGES FOR ROLE proj_api_owner IN SCHEMA proj_api GRANT EXECUTE ON FUNCTIONS TO proj_api_executor;
ALTER DEFAULT PRIVILEGES IN SCHEMA proj_api GRANT EXECUTE ON FUNCTIONS TO proj_api_executor;

SET search_path to proj_api, public;

DO $SQL$
BEGIN
  IF CURRENT_DATABASE() ~ '^(prod|staging|integration)_proj_db$' THEN
    -- change default search_path for production, staging and integration databases
    EXECUTE 'ALTER DATABASE ' || CURRENT_DATABASE() || ' SET search_path to proj_api, public;';
  END IF;
END
$SQL$;

SET role TO proj_api_owner;
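
For completeness, one of the stored procedure files, for example 30_stored_procedures/20_get_object.sql, could look like the following minimal sketch (the objects table and its columns are hypothetical). Note that it does not mention any schema name, so the function lands in whatever API schema the search_path points to:

CREATE OR REPLACE FUNCTION get_object( p_object_id bigint )
RETURNS TABLE ( object_id bigint, object_name text ) AS
$$
  -- proj_data is the separate (unversioned) data schema holding the actual tables
  SELECT o.object_id, o.object_name
    FROM proj_data.objects AS o
   WHERE o.object_id = p_object_id;
$$ LANGUAGE sql STABLE;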

This kind of layout makes it easy to bootstrap the API schema objects into any database and, which is very important, keeps track of all the database logic changes in the SCM system, so you can review and compare the changes between releases.

Bootstrapping into a development database can be done with a very simple script like this:
(
echo 'DROP SCHEMA IF EXISTS proj_api CASCADE;'
find 50_proj_api -type f -name '*.sql' \
  | sort \
  | xargs cat
) | psql dev_proj_db -1 -f -
In the case of a development database, we actually bootstrap all the objects, including tables, into a freshly prepared database instance, so that integration tests can run and modify data as they want.

Injecting into a production or staging database can be automated and implemented with different kinds of additional checks, but in the end it boils down to something like:
(
sed 's/proj_api/proj_api_vX_Y_Z/g' 50_proj_api/00_create_schema.sql
find 50_proj_api -type f -name '*.sql' ! -name '00_create_schema.sql' \
  | sort \
  | xargs cat
) | psql prod_proj_db -1 -f -
So after that, we have a fresh copy of the whole shiny API schema with all its dependencies rolled out to the production database. These schema objects are only accessed by the software that is supposed to do so and that has been tested against this very combination of versions of the stored procedures and dependent types. And if we see any problems with the rollout, we can just roll back the software stack, so it goes on accessing the old stored procedures located in the schema with the previous version of our API.

This method does not solve the problem of versioning the tables in our data schema (we keep all the tables, related objects and low-level transformation stored procedures in the proj_data schema), but for that there is a very simple and very nice solution suggested and implemented by Depesz: http://www.depesz.com/index.php/2010/08/22/versioning/. Of course, changes in the table structure should still be kept backwards compatible, and carefully written database diff rollout and rollback files should be created for every such change.
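
As an illustration of such a change (the file names, table and column are hypothetical): a backwards compatible rollout adds a nullable column, and the matching rollback removes it again.

-- 0001_add_objects_description.rollout.sql
ALTER TABLE proj_data.objects ADD COLUMN description text;  -- nullable, so old code keeps working

-- 0001_add_objects_description.rollback.sql
ALTER TABLE proj_data.objects DROP COLUMN description;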

I am not going into details about how to prepare the Spring configuration of the JDBC pools for the Java clients, or how to configure the bootstrapping for integration testing in your Maven project, as this information would not add any real value to this blog post, which already became much longer than I expected.

NOTE: Because of a bug in the PostgreSQL JDBC driver, the types that are used as input parameters for stored procedures cannot be located in different schemas (type OIDs are looked up by name only, without taking the schema and search_path into consideration). Patching the driver is very easy, and we did so in my company to be able to use the schema based versioning in our Java projects. I have reported the bug twice already (http://archives.postgresql.org/pgsql-jdbc/2011-03/msg00007.php, http://archives.postgresql.org/pgsql-jdbc/2011-12/msg00083.php), but unfortunately got no response from anybody. Probably I will have to submit a patch myself sometime.

2011-10-30

How to remove SIM Card PIN from your GSM/UMTS modem on Ubuntu (Linux)

If your GSM modem's SIM card is configured with a PIN, NetworkManager constantly asks for that PIN on every wake-up... and this is quite annoying indeed. So the easy way to remove the PIN protection from your SIM card under Ubuntu is:
sudo apt-get install gsm-utils
sudo gsmctl -d /dev/ttyACM0 -o unlock sc all 1234
Here 1234 is your current SIM card PIN, which is the one being removed.

2011-04-06

Index sizes depending on the type of the field being indexed

The sizes of indexes (and of tables, of course) directly influence the lookup speed: the smaller the corresponding file system files are, the faster they can be scanned, not to mention how much of them needs to be kept in the memory caches.

So I did a simple experiment to demonstrate the effect that choosing different field types for the indexed field has on the size of the indexes and tables.

/*
DROP DATABASE IF EXISTS eval;
CREATE DATABASE eval;
\c eval
*/
SET work_mem TO '128MB';
SET maintenance_work_mem TO '128MB';

DROP SCHEMA IF EXISTS eval_schema CASCADE;
CREATE SCHEMA eval_schema;

SET search_path to eval_schema;

CREATE TABLE eval_config ( table_name text, id_field_type text, id_field_expression text, row_count integer DEFAULT 100000 );
INSERT INTO eval_config VALUES 
( 'integer_short_id_table', 'integer', 's.i' ),
( 'integer_large_id_table', 'integer', 's.i * 123' ),
( 'bigint_short_id_table', 'bigint', 's.i' ),
( 'bigint_large_id_table', 'bigint', 's.i * 123' ),
( 'text_number_short_id_table', 'text', 's.i' ),
( 'text_number_large_id_table', 'text', 's.i * 123' ),
( 'numeric_short_id_table', 'numeric', 's.i' ),
( 'numeric_large_id_table', 'numeric', 's.i * 123' ),
( 'binary_md5_text_table', 'bytea', $$decode( md5( s.i::text || '-text-filler'), 'hex' ) $$ ),
( 'md5_text_table', 'text', $$md5( s.i::text || '-text-filler' )$$ );

CREATE VIEW eval_table_stats AS
  SELECT t.relname as "Table name",
       c.id_field_type as "Indexed field type",
       c.id_field_expression as "Expression",
       c.row_count as "Row count",
       s.stawidth as "Average id field width",
       pg_size_pretty(pg_table_size(t.oid)) as "Table size without index",
       pg_size_pretty(pg_indexes_size(t.oid)) as "Index size" /*,
       n_tup_ins,
       n_tup_upd,
       n_tup_del,
       n_tup_hot_upd,
       n_live_tup,
       n_dead_tup */
  FROM eval_config as c
  JOIN pg_class as t
    ON c.table_name = t.relname
   AND t.relkind = 'r'
  JOIN pg_namespace as n
    ON relnamespace = n.oid
   AND n.nspname = 'eval_schema'
  LEFT 
  JOIN pg_statistic as s
    ON s.starelid = t.oid
   AND s.staattnum = 1
  LEFT
  JOIN pg_stat_user_tables as w
    ON w.relid = t.oid;

DO $SQL$
DECLARE 
  config record;
BEGIN
  FOR config IN SELECT * FROM eval_config
  LOOP
    RAISE INFO 'Creating table %', quote_ident( config.table_name );
    EXECUTE $$
    CREATE TABLE $$ || quote_ident( config.table_name ) || $$
    ( id $$ || config.id_field_type || $$, data text );
    $$;
    RAISE INFO 'Filling table %', quote_ident( config.table_name );
    EXECUTE $$
    INSERT INTO $$ || quote_ident( config.table_name ) || $$
    SELECT ( $$ || config.id_field_expression || $$ )::$$ || config.id_field_type || $$, 'some filling data'
      FROM generate_series(1, $$ || config.row_count || $$) as s(i);
    $$;
    RAISE INFO 'Building index on %', quote_ident( config.table_name );
    EXECUTE $$
    CREATE INDEX ON $$ || quote_ident( config.table_name ) || $$ ( id );
    $$;
    RAISE INFO 'Analyzing table %', quote_ident( config.table_name );
    EXECUTE $$
    ANALYZE $$ || quote_ident( config.table_name ) || $$;
    $$;
  END LOOP;
END;
$SQL$;

SELECT * FROM eval_table_stats;

DO $SQL$
DECLARE 
  config record;
BEGIN
  FOR config IN SELECT * FROM eval_config
  LOOP
    RAISE INFO 'Bloating table % (phase 1)', quote_ident( config.table_name );
    EXECUTE $$
    UPDATE $$ || quote_ident( config.table_name ) || $$
       SET data = data
     WHERE random() > 0.5;
    $$;
    RAISE INFO 'Bloating table % (phase 2)', quote_ident( config.table_name );
    EXECUTE $$
    UPDATE $$ || quote_ident( config.table_name ) || $$
       SET data = data
     WHERE random() > 0.5;
    $$;
    RAISE INFO 'Analyzing table %', quote_ident( config.table_name );
    EXECUTE $$
    ANALYZE $$ || quote_ident( config.table_name ) || $$;
    $$;
  END LOOP;
END;
$SQL$;

SELECT * FROM eval_table_stats;

As a result of executing this script, we get several tables and some statistics on the table and index sizes.
I created the tables with an id field of the following types: integer, bigint, text and numeric, plus bytea and text for the md5 hash indexes. Note that the table size includes the filler text data, so it is not only the size of the field being evaluated:

Table sizes just after insertion of 100K rows

Table name | Indexed field type | Expression | Row count | Average id field width | Table size without index | Index size
integer_short_id_table | integer | s.i | 100000 | 4 | 5128 kB | 1768 kB
integer_large_id_table | integer | s.i * 1234 | 100000 | 4 | 5128 kB | 1768 kB
bigint_short_id_table | bigint | s.i | 100000 | 8 | 5552 kB | 2208 kB
bigint_large_id_table | bigint | s.i * 1234 | 100000 | 8 | 5552 kB | 2208 kB
text_number_short_id_table | text | s.i | 100000 | 5 | 5128 kB | 2200 kB
text_number_large_id_table | text | s.i * 1234 | 100000 | 9 | 5552 kB | 2624 kB
numeric_short_id_table | numeric | s.i | 100000 | 8 | 5552 kB | 2616 kB
numeric_large_id_table | numeric | s.i * 1234 | 100000 | 9 | 5624 kB | 2656 kB
binary_md5_text_table | bytea | decode( md5( s.i::text || '-text-filler'), 'hex' ) | 100000 | 17 | 6336 kB | 3552 kB
md5_text_table | text | md5( s.i::text || '-text-filler' ) | 100000 | 33 | 7880 kB | 5328 kB

Table sizes after random bloating (2 times random update of 50% of rows)

Table name | Indexed field type | Expression | Row count | Average id field width | Table size without index | Index size
integer_short_id_table | integer | s.i | 100000 | 4 | 10232 kB | 5288 kB
integer_large_id_table | integer | s.i * 1234 | 100000 | 4 | 10232 kB | 5272 kB
bigint_short_id_table | bigint | s.i | 100000 | 8 | 11 MB | 6592 kB
bigint_large_id_table | bigint | s.i * 1234 | 100000 | 8 | 11 MB | 6560 kB
text_number_short_id_table | text | s.i | 100000 | 5 | 10232 kB | 5896 kB
text_number_large_id_table | text | s.i * 1234 | 100000 | 9 | 11 MB | 7096 kB
numeric_short_id_table | numeric | s.i | 100000 | 8 | 11 MB | 7752 kB
numeric_large_id_table | numeric | s.i * 1234 | 100000 | 9 | 11 MB | 7880 kB
binary_md5_text_table | bytea | decode( md5( s.i::text || '-text-filler'), 'hex' ) | 100000 | 17 | 12 MB | 7336 kB
md5_text_table | text | md5( s.i::text || '-text-filler' ) | 100000 | 33 | 15 MB | 11 MB

Table sizes just after insertion of 5M rows

Table name | Indexed field type | Expression | Row count | Average id field width | Table size without index | Index size
integer_short_id_table | integer | s.i | 5000000 | 4 | 249 MB | 86 MB
integer_large_id_table | integer | s.i * 123 | 5000000 | 4 | 249 MB | 86 MB
bigint_short_id_table | bigint | s.i | 5000000 | 8 | 269 MB | 107 MB
bigint_large_id_table | bigint | s.i * 123 | 5000000 | 8 | 269 MB | 107 MB
text_number_short_id_table | text | s.i | 5000000 | 7 | 269 MB | 107 MB
text_number_large_id_table | text | s.i * 123 | 5000000 | 9 | 269 MB | 128 MB
numeric_short_id_table | numeric | s.i | 5000000 | 8 | 269 MB | 129 MB
numeric_large_id_table | numeric | s.i * 123 | 5000000 | 10 | 284 MB | 129 MB
binary_md5_text_table | bytea | decode( md5( s.i::text || '-text-filler'), 'hex' ) | 5000000 | 17 | 308 MB | 172 MB
md5_text_table | text | md5( s.i::text || '-text-filler' ) | 5000000 | 33 | 383 MB | 259 MB

Table sizes after random bloating (2 times random update of 50% of rows)

Table name | Indexed field type | Expression | Row count | Average id field width | Table size without index | Index size
integer_short_id_table | integer | s.i | 5000000 | 4 | 498 MB | 257 MB
integer_large_id_table | integer | s.i * 123 | 5000000 | 4 | 498 MB | 257 MB
bigint_short_id_table | bigint | s.i | 5000000 | 8 | 539 MB | 321 MB
bigint_large_id_table | bigint | s.i * 123 | 5000000 | 8 | 539 MB | 321 MB
text_number_short_id_table | text | s.i | 5000000 | 7 | 538 MB | 287 MB
text_number_large_id_table | text | s.i * 123 | 5000000 | 9 | 539 MB | 340 MB
numeric_short_id_table | numeric | s.i | 5000000 | 8 | 539 MB | 384 MB
numeric_large_id_table | numeric | s.i * 123 | 5000000 | 10 | 569 MB | 384 MB
binary_md5_text_table | bytea | decode( md5( s.i::text || '-text-filler'), 'hex' ) | 5000000 | 17 | 615 MB | 354 MB
md5_text_table | text | md5( s.i::text || '-text-filler' ) | 5000000 | 33 | 766 MB | 545 MB

OK, we see here that it is actually the byte width of the field that matters. For small numbers the difference is really not that significant, but still, an index on an integer (4 bytes wide) field is a little smaller than an index on the text representation of the same ids, even for tables that keep only 10000 records (those results are not posted here).

I deliberately created the tables in two variants: one with the id taken directly from the 1..N series, and one with it multiplied by 1234 or 123 to make the numbers wider in their text representation, to see how the length of the text representation of the number influences the size of the index.

Some more playing with the smallint type showed that it does save space in the table itself, but the index size is the same as for integer.
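
If you want to reproduce that yourself, one way (an assumed extension of the script above, not something I ran for the numbers in this post) is to add a smallint variant to eval_config and rerun the script; note the reduced row count, because 100000 does not fit into the smallint range:

INSERT INTO eval_config VALUES
( 'smallint_short_id_table', 'smallint', 's.i', 30000 ),  -- 30000 still fits into smallint
( 'integer_short_30k_table', 'integer',  's.i', 30000 );  -- integer table with the same row count, for comparison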

What was really surprising for me was that the numeric type had bigger indexes than text for the same numbers.

And of course, again and again: indexes on long text fields can be built on the md5 hash of the text instead, and they are even more compact when you index the bytea version of the md5 hash rather than its text representation. Though md5 hash collisions can theoretically happen, I have not experienced them myself in practice yet.
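
A minimal sketch of that technique (the documents table and its columns are hypothetical): an expression index on the binary md5 hash of the text, queried with the same expression so that the index can be used.

CREATE TABLE documents ( id bigserial PRIMARY KEY, body text NOT NULL );

-- index the 16-byte binary hash instead of the long text value itself
CREATE INDEX documents_body_md5_idx
    ON documents ( decode( md5( body ), 'hex' ) );

-- equality lookups have to use the same expression to hit the index;
-- the extra comparison on body guards against theoretical md5 collisions
SELECT id
  FROM documents
 WHERE decode( md5( body ), 'hex' ) = decode( md5( 'some long text value' ), 'hex' )
   AND body = 'some long text value';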

2011-02-17

pgAdmin III macros: get table fields

When writing SQL statements or stored procedure code in pgAdmin, it is quite often handy to know the names of, and the constraints on, the fields of some table.

Here is a simple macro that can be assigned to a key combination in pgAdmin -> Query Window -> Macros -> Manage macros...

select quote_ident(nspname) || '.' || quote_ident(relname) as table_name, 
       quote_ident(attname) as field_name, 
       format_type(atttypid,atttypmod) as field_type, 
       case when attnotnull then ' NOT NULL' else '' end as null_constraint,
       case when atthasdef then 'DEFAULT ' || 
                                ( select pg_get_expr(adbin, attrelid) 
                                    from pg_attrdef 
                                   where adrelid = attrelid and adnum = attnum )::text else '' 
       end as default_value,
       case when nullif(confrelid, 0) is not null 
            then confrelid::regclass::text || '( ' || 
                 array_to_string( ARRAY( select quote_ident( fa.attname ) 
                                           from pg_attribute as fa 
                                          where fa.attnum = ANY ( confkey ) 
                                            and fa.attrelid = confrelid
                                          order by fa.attnum 
                                        ), ',' 
                                 ) || ' )' 
            else '' end as references_to
  from pg_attribute 
       left outer join pg_constraint on conrelid = attrelid 
                                    and attnum = conkey[1] 
                                    and array_upper( conkey, 1 ) = 1,
       pg_class, 
       pg_namespace
 where pg_class.oid = attrelid
   and pg_namespace.oid = relnamespace
   and pg_class.oid = btrim( '$SELECTION$' )::regclass::oid
   and attnum > 0
   and not attisdropped
 order by attrelid, attnum;

Just select the name of the table you are interested in and press the key binding you assigned to that macro. pgAdmin will execute the query and show the list of all the columns of that table, including default values and foreign key references...

My preferred key binding is CTRL+1

Type of NULL is important

The following SQL query shows how different the results of concatenating arrays with NULL values can be:

select '{e1,e2}'::text[] || 't'::text     as normal_array_concatenation,
       NULL::text[] || 't'::text         as appending_element_to_existing_NULL_value_array, 
       NULL || 't'::text                 as appending_element_to_NULL_value,
       '{e1,e2}'::text[] || NULL::text   as appending_typed_NULL,
       '{e1,e2}'::text[] || NULL::text[] as appending_typed_NULL_array;

The result of the execution (on 9.0) is:

─[ RECORD 1 ]──────────────────────────────────┬─────────────
normal_array_concatenation                     │ {e1,e2,t}
appending_element_to_existing_null_value_array │ {t}
appending_element_to_null_value                │ 
appending_typed_null                           │ {e1,e2,NULL}
appending_typed_null_array                     │ {e1,e2}

That explains why in PL/pgSQL one can simply declare a new array variable and immediately start concatenating values to it, without pre-initializing it with an empty array. This is something you do have to do when you want to build up a text string: you cannot just take a NULL::text variable and start concatenating strings to it, you have to pre-initialize it with an empty string first; but with arrays you can skip that step.
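
A small PL/pgSQL sketch of that difference (with explicit casts on the appended values, to keep the operator resolution unambiguous):

DO $SQL$
DECLARE
  a text[];   -- NULL, not initialized
  s text;     -- NULL, not initialized
BEGIN
  a := a || 'x'::text;   -- NULL::text[] || 'x' gives {x}
  a := a || 'y'::text;   -- {x,y}
  s := s || 'x'::text;   -- NULL || 'x' stays NULL
  RAISE INFO 'a is %, s is %', a, s;   -- prints: a is {x,y}, s is <NULL>
END;
$SQL$;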

Another very important issue related to that behaviour of arrays is that, when creating dynamic SQL queries for the EXECUTE command in PL/pgSQL, you SHOULD always put explicit type casts on the values that you quote with quote_literal() or quote_nullable() while building the query:

DO $SQL$
DECLARE
  r text[];
  t text   := NULL;
  a text[] := NULL;
BEGIN

  SELECT '{e1,e2}'::text[] || a INTO r;
  RAISE INFO 'array concatenate: r is %', r;

  EXECUTE $$SELECT '{e1,e2}'::text[] || $$ || quote_nullable(a) || $$::text[] $$ INTO r;
  RAISE INFO 'array concatenate from EXECUTE: r is %', r;

  SELECT '{e1,e2}'::text[] || t INTO r;
  RAISE INFO 'array append: r is %', r;

  EXECUTE $$SELECT '{e1,e2}'::text[] || $$ || quote_nullable(t) || $$::text $$ INTO r;
  RAISE INFO 'array append from EXECUTE: r is %', r;

  /* These 2 examples will fail on the runtime 
     throwing exception (operator is not unique: text[] || unknown)
  --EXECUTE $$SELECT '{e1,e2}'::text[] || $$ || quote_nullable(a) INTO r;
  --RAISE INFO 'r is %', r;
  --
  --EXECUTE $$SELECT '{e1,e2}'::text[] || $$ || quote_nullable(t) INTO r;
  --RAISE INFO 'r is %', r;
  */

END;
$SQL$;

2010-11-24

Reflecting generic type parameters in Java

When introspecting an object's class in Java, it has always been a problem to find out the actual type of the generic type parameter of a class field or property.

OK, yesterday Daniel and I were listening to a presentation about EclipseLink by a guy from Oracle (or from a company that works with Oracle, but this is not so important). The question of generic type erasure was on our minds the whole time while he showed how EclipseLink generates adapter classes for his model classes. So we asked about the issue, and he showed us how one can extract generic type parameter information for object fields using reflection. This approach only works on Field objects actually, but that is more than enough to be able to extract the actual generic parameter types of class fields :)

Here is a simple example that I made, inspired by this information:

import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.lang.reflect.TypeVariable;
import java.util.Arrays;
import java.util.Map;

import org.junit.Before;
import org.junit.Test;

public class GenericTypeParameterDemo {

    static final class MyGenericType<E extends Number> {
        public Map<String, E> map;
    }

    @Before
    public void setUp() throws Exception {
    }

    @Test
    public void extractGenericParameters() throws Exception {
        Field field = MyGenericType.class.getDeclaredField("map");

        Type fieldGenericType = field.getGenericType();
        ParameterizedType parametrizedFieldType = (ParameterizedType) fieldGenericType;

        for (Type actualParameterType : parametrizedFieldType.getActualTypeArguments()) {
            if (actualParameterType instanceof TypeVariable<?>) {
                // a type variable like E: print it together with its declared bounds
                Type[] bounds = TypeVariable.class.cast(actualParameterType).getBounds();
                System.out.println(actualParameterType.toString()
                        + " with bounds " + Arrays.toString(bounds));
            } else {
                // a concrete type argument like String
                System.out.println(actualParameterType);
            }
        }
    }
}

It is really simple. The trick is in the two lines that fetch the generic type of the field and cast it: check whether the result of Field.getGenericType() is actually an instance of ParameterizedType. One can also look at the other sub-interfaces of the Type interface, which can give more details about the structure of your field types.