Monday, December 4, 2017

CAP Theorem (update)

Reading further on the CAP theorem, I found this post from Daniel Abadi, where the author explains the limitations of the CAP theorem and proposes a revisited version, PACELC: if there is a Partition, trade off Availability against Consistency; Else, trade off Latency against Consistency.

Thursday, November 16, 2017

CAP Theorem

A useful reminder on the CAP theorem (aka Brewer's theorem), as I was studying a MOOC on HDFS.
What the CAP theorem says is that in a distributed system, it is impossible to guarantee more than two of the following three properties:
  • Consistency: every read receives the most recent write.
  • Availability: every read receives an answer, without any guarantee that it contains the most recent write.
  • Partition tolerance: the system keeps operating even when the network between the nodes fails (except in case of a total breakdown).

Some well-known DBMS can be positioned as follows:
  • C+A: Oracle, MySQL
  • C+P: HDFS, MongoDB
  • A+P: Redis, Elasticsearch



Wednesday, July 5, 2017

A full table scan is not always bad, an index is not always good

This is a well-known fact: (almost) every developer knows that "sometimes, a full table scan can perform better than using an index".

Let's show this in a little demonstration, which I picked up from Tom Kyte's Expert Oracle Database Architecture. The purpose of the demo is to show that, under some circumstances, an index access can be more expensive than a full table scan.

I'm running this demo on an Oracle XE 11gR2 instance on a Lenovo X1 Yoga notebook running Windows 10 (64-bit), with an Intel Core i7-6600U CPU (2.60 GHz) and 16 GB RAM.


Setup


Let's first create a table to hold data with a column having values sorted:

CREATE TABLE ORDERED (X NUMBER, Y VARCHAR2(2000), PRIMARY KEY(X));

Now populate ORDERED, taking care to insert the values of column X in ascending order, from 1 to 1'000'000.

begin
  for i in  1 .. 1000000 
  loop
    insert into ordered 
    values(i, rpad(dbms_random.random, 75, '*'));
  end loop;
end;
/
commit;


... then create a table UNORDERED with an identical structure, and populate it with the data from the first table, putting the values of column X in random order (ordering by Y effectively shuffles X):

CREATE TABLE UNORDERED AS SELECT X, Y FROM ORDERED ORDER BY Y;
ALTER TABLE UNORDERED ADD CONSTRAINT PK_UNORDERED PRIMARY KEY (X);

And before starting with the demo, gather the statistics:
begin
  dbms_stats.gather_table_stats(USER, 'UNORDERED');
  dbms_stats.gather_table_stats(USER, 'ORDERED');
end;
/
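
The plan tables shown below can be displayed, after each EXPLAIN PLAN, with DBMS_XPLAN - presumably with a call like this one:

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);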

Comparing the execution plans

1. Table with ordered data:

SQL Statement:
EXPLAIN PLAN FOR SELECT * FROM ORDERED WHERE X BETWEEN 10000 AND 30000;

-------------------------------------------------------------------------------------------
| Id  | Operation                   | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             | 20002 |  1582K|   323   (0)| 00:00:04 |
|   1 |  TABLE ACCESS BY INDEX ROWID| ORDERED     | 20002 |  1582K|   323   (0)| 00:00:04 |
|*  2 |   INDEX RANGE SCAN          | SYS_C007666 | 20002 |       |    68   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

2. Table with unordered data, full table scan:

SQL Statement:
EXPLAIN PLAN FOR SELECT * FROM UNORDERED WHERE X BETWEEN 10000 AND 30000;

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           | 20002 |  1582K|  3275   (1)| 00:00:40 |
|*  1 |  TABLE ACCESS FULL| UNORDERED | 20002 |  1582K|  3275   (1)| 00:00:40 |
-------------------------------------------------------------------------------

3. Table with unordered data, using the index:

SQL Statement:
EXPLAIN PLAN FOR SELECT /*+ INDEX(UNORDERED PK_UNORDERED) */ *
FROM UNORDERED WHERE X BETWEEN 10000 AND 30000;

--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              | 20002 |  1582K| 20049   (1)| 00:04:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| UNORDERED    | 20002 |  1582K| 20049   (1)| 00:04:01 |
|*  2 |   INDEX RANGE SCAN          | PK_UNORDERED | 20002 |       |    44   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------


Conclusion


The execution plans show a much higher cost when the index is used on the unordered data - in this particular case. The reason is the physical ordering of the rows (the clustering factor): in ORDERED, rows with adjacent values of X sit in the same blocks, so the index range scan revisits few blocks; in UNORDERED, the matching rows are scattered all over the table, so almost every fetched row requires another block read. This might be unexpected, but it shows how important physical design is. So blindly adding indexes when encountering full table scans is not always a good idea.
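
This can be verified by comparing the clustering factor of the two primary key indexes - a small sketch against the setup above:

SELECT INDEX_NAME, TABLE_NAME, CLUSTERING_FACTOR, NUM_ROWS
FROM USER_INDEXES
WHERE TABLE_NAME IN ('ORDERED', 'UNORDERED');

A clustering factor close to the number of blocks in the table means the index order matches the table order; a value close to NUM_ROWS means nearly every index entry points to a different block - which is what makes the index plan on UNORDERED so expensive.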

Trying to override the optimizer's first choice might also not be a good idea. Oddly, there is often a developer in the room tempted to prevent the optimizer from using a full table scan. Not good.

Finally, let's also point out that maintaining an index has a cost of its own: it must be updated on every insert, update, and delete.

Friday, June 30, 2017

What is the difference between npm install and npm update?

I keep doing stuff that I don't understand.

I was wondering why we were instructed to always run npm install and npm update after pulling from the repository. Couldn't a single command do it?


So, here's a short note on the difference between npm install and npm update:

Sample package.json file (annotated - the // comments are not valid JSON):

{
  "name":          "my-project",
  "version":       "1.0.0",                            // install   update
  "dependencies":  {                                   // ------------------
    "already-installed-versionless-module":  "*",      // ignores   "1.0" -> "1.1"
    "already-installed-semver-module":       "^1.4.3", // ignores   "1.4.3" -> "1.5.2"
    "already-installed-versioned-module":    "3.4.1",  // ignores   ignores
    "not-yet-installed-versionless-module":  "*",      // installs  installs
    "not-yet-installed-semver-module":       "^4.2.1", // installs  installs
    "not-yet-installed-versioned-module":    "2.7.8"   // installs  installs
  }
}
Conclusion: The only big difference is that an already installed module with fuzzy versioning ...
  • gets ignored by npm install
  • gets updated by npm update

Thursday, June 29, 2017

PL/SQL: a simple cursor example

Here is a very basic demonstration of a PL/SQL program that updates data in a table, using an explicit cursor FOR loop.

The program is based on the EMPLOYEES table of the HR schema.

DECLARE

  -- select the rows to update; FOR UPDATE locks them for the duration of the transaction
  CURSOR C1 IS 
    SELECT * FROM EMPLOYEES WHERE DEPARTMENT_ID = 60 
    FOR UPDATE;

BEGIN

  SAVEPOINT TX_START;

  FOR RC IN C1 LOOP

    DBMS_OUTPUT.PUT_LINE('Fetching record ' || RC.LAST_NAME);

    IF RC.SALARY > 5000 THEN
      -- WHERE CURRENT OF updates the row the cursor currently points to
      UPDATE EMPLOYEES SET SALARY = SALARY * 1.1 WHERE CURRENT OF C1;
    END IF;

  END LOOP;

  COMMIT; -- releases the locks taken by FOR UPDATE

EXCEPTION

  WHEN OTHERS THEN
    ROLLBACK TO TX_START;
    DBMS_OUTPUT.PUT_LINE(SQLERRM);
    RAISE;

END;
/
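
To actually see the DBMS_OUTPUT lines, server output must first be enabled in the client (SQL*Plus or SQL Developer):

SET SERVEROUTPUT ON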

Thursday, May 11, 2017

How to use XStream to generate xml from objects

I have to instrument my code to generate tests. For that purpose, I use serialization to capture the state of my objects on the server while running some system tests, and I then use the generated files in my integration tests, using deserialization.

My colleague Fabrizio provided me with a neat utility that generates JSON files from all my serializable classes.

But I ran into difficulties with the complexity of some object graphs, so I use XStream to generate XML.

Here's how to dump an object to XML:

  import java.io.*;
  import com.thoughtworks.xstream.XStream;

  // myObject: any serializable object captured from the server
  XStream xstream = new XStream();
  String xml = xstream.toXML(myObject);
  String filePath = "output.xml";
  // try-with-resources closes the writer; FileOutputStream creates the file if needed
  try (Writer writer = new BufferedWriter(
          new OutputStreamWriter(new FileOutputStream(filePath), "utf-8"))) {
      writer.write(xml);
  } catch (IOException e) {
      e.printStackTrace();
  }

Wednesday, March 22, 2017

How to manipulate dates in SQL (demo)

This is a reminder on how to manipulate dates in SQL (Oracle).

Setup:

I'll create a simple table and populate it with some data:
create table date_test(id number, date1 date, ts1 timestamp, date2 date, ts2 timestamp);

insert into date_test values (1, 
to_date('2014-06-27', 'RRRR-MM-DD'), 
to_timestamp('2014-06-27', 'RRRR-MM-DD'), 
to_date('2014-06-27', 'RRRR-MM-DD'), 
to_timestamp('2014-06-27', 'RRRR-MM-DD'));

commit;

Data selection examples:

SELECT CURRENT_TIMESTAMP FROM DUAL;

SELECT CURRENT_TIMESTAMP, CURRENT_TIMESTAMP + INTERVAL '1' SECOND(1) 
FROM DUAL;

SELECT  TO_CHAR (TS1, 'RRRR-MM-DD, HH24:MI:SS') AS ORI, 
TO_CHAR (TS1-INTERVAL '1' SECOND(1),'RRRR-MM-DD, HH24:MI:SS') AS FORMATTED 
FROM DATE_TEST;
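
A few more common manipulations, to round this off: date arithmetic works in days, and ADD_MONTHS and TRUNC are the usual helpers (a small sketch):

SELECT SYSDATE + 7                    AS NEXT_WEEK,
       ADD_MONTHS(SYSDATE, 1)         AS NEXT_MONTH,
       TRUNC(SYSDATE, 'MM')           AS FIRST_OF_MONTH,
       TO_CHAR(SYSDATE, 'RRRR-MM-DD') AS FORMATTED
FROM DUAL;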

See also this page.

How to use the CASE clause in a SQL statement (a demo)

This is a little demo to show how to use the CASE clause in a SQL statement:

drop table two_fields;

drop table two_fields2;

create table two_fields (id number, text varchar2(200), field1 varchar2(10), field2 varchar2(10));

create table two_fields2 (id number, text varchar2(200), field12 varchar2(10));

insert into two_fields values (1, 'text1', 'f1-1', null);
insert into two_fields values (2, 'text2', 'f1-2', null);
insert into two_fields values (3, 'text3', 'f1-3', null);
insert into two_fields values (4, 'text4', null, 'f2-1');
insert into two_fields values (5, 'text5', null, 'f2-2');
insert into two_fields values (6, 'text6', null, 'f2-3');

commit;

select * from two_fields;

select * from two_fields2;

-- note: a searched CASE is needed here; "CASE field1 WHEN NULL" would never match, because NULL = NULL does not evaluate to TRUE
select id, text, field1, field2, case when field1 is null then 'field2' else 'field1' end as f1f2 from two_fields;

select id, text, field1, field2, decode (field1 , null, field2, field1) f1t from two_fields;
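
For this particular pattern (take field1 unless it is NULL), NVL or the standard COALESCE are shorter equivalents:

select id, text, nvl(field1, field2) as f1f2 from two_fields;
select id, text, coalesce(field1, field2) as f1f2 from two_fields;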

INSERT INTO TWO_FIELDS2 (SELECT ID, TEXT,  decode (field1 , null, field2, field1) from two_fields);


ALTER TABLE two_fields ADD CONSTRAINT c1 check (field1 is not null or field2 is not null);

ALTER TABLE two_fields ADD CONSTRAINT c2 check ((field1 is not null and field2 is null) OR (field1 is null and field2 is not null));
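
With c2 in place, a row that sets both fields should now be rejected - a quick check:

-- should fail with ORA-02290 (check constraint violated)
insert into two_fields values (7, 'text7', 'f1-4', 'f2-4');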

Find out the columns used in a referential integrity constraint

Using the data dictionary views USER_CONSTRAINTS and USER_CONS_COLUMNS, it is possible to find out which columns are used in a referential integrity constraint:


SELECT 
  UCC1.TABLE_NAME, 
  UCC1.COLUMN_NAME, 
  UCC1.CONSTRAINT_NAME, 
  UC.R_CONSTRAINT_NAME, 
  UCC2.TABLE_NAME AS R_TABLE_NAME,
  UCC2.COLUMN_NAME AS R_COLUMN_NAME
FROM USER_CONSTRAINTS UC, 
  USER_CONS_COLUMNS UCC1, 
  USER_CONS_COLUMNS UCC2
WHERE UC.CONSTRAINT_TYPE = 'R'
  AND UCC1.CONSTRAINT_NAME = UC.CONSTRAINT_NAME
  AND UCC2.CONSTRAINT_NAME = UC.R_CONSTRAINT_NAME
  AND UCC1.POSITION = UCC2.POSITION; -- needed to pair columns correctly in multi-column foreign keys
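
A quick way to try it out, with two hypothetical tables:

create table parent (id number primary key);
create table child (
  id number primary key,
  parent_id number,
  constraint fk_child_parent foreign key (parent_id) references parent(id)
);
-- the query above then lists CHILD.PARENT_ID referencing PARENT.ID via FK_CHILD_PARENT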

How to find out which roles a user has

I often have to find out whether a particular user has a certain role.

This is the command to read the AD roles (group memberships) of a user:

U:\>net user z100079 /domain
The request will be processed at a domain controller for domain mobi.mobicorp.ch.

User name                    Z100079
Full Name                    Z100079
Comment                      Service Account für "Vertragsverlauf", Gysling Eva
User's comment
Country/region code          000 (System Default)
Account active               Yes
Account expires              30.12.2099 23:00:00

Password last set            27.01.2011 13:30:38
Password expires             Never
Password changeable          27.01.2011 13:30:38
Password required            Yes
User may change password     Yes

Workstations allowed         All
Logon script
User profile
Home directory
Last logon                   Never

Logon hours allowed          All

Local Group Memberships
Global Group memberships     *AMPS_WS              *Domain Users
                             *ASES-INTERN          *AISDV_CONSUMER
                             *ACONTRACTREADCLIENT  *ASES-EXTERN
                             *ALOSSRATIOREADCLIENT *ACLAIMREADCLIENT

Sunday, January 15, 2017

Angular JS and URLs

While trying to run a demo application from an OpenClassroom tutorial, I was confronted with the problem that all my URLs were rewritten with a #. This ruined my navigation.

From a few contributions on Stack Overflow (especially this one), I found a solution:
1. Configuring $locationProvider
- This makes the application run in HTML5 mode, removing the need to add # to the URL

2. Setting the base for relative links

1. Configuring $locationProvider

in the file app.js:

angular.module('cineAngularApp', ['ngRoute'])
  .config(function ($routeProvider, $locationProvider) {
    $routeProvider
      .when('/', {
         ...
      });
    $locationProvider.html5Mode(true);
  });

2. Setting the base for relative links

in index.html, inside the head element:
<html> <head> <base href="/"> ... </head> ... </html>
Afterwards, I still had the problem that all URLs were rewritten with a %2F. This is related to the fact that AngularJS 1.6 changed the default prefix for hash-bang URLs in the $location service.
I just had to configure $locationProvider to change that behavior (see this post): $locationProvider.hashPrefix('');