Sunday, December 18, 2011

Database research journals and articles

Www.dbjournal.ro
Www.doaj.org

Mit database group
Db.csail.mit.edu

Cs-www.cs.yale.edu/homes/dna/pubs
/displaypubs.cgi

Thursday, December 15, 2011

Database Conferences in the world



http://academic.research.microsoft.com/RankList?entitytype=3&topDomainID=2&subDomainID=18&last=0&start=1&end=100

http://www.vldb2010.org/accept.htm


research at University of Wisconsin - Madison
https://database.cs.wisc.edu/research/


http://www.quora.com/Database-Systems/What-are-the-best-recommended-research-topics-on-databases-according-to-edge-technologies-and-recent-research-trends

DB News
http://www.tdan.com/view-featured-columns/15299



http://db.cs.berkeley.edu/oldprojects.php

http://www.dvs.tu-darmstadt.de/

http://www.cs.cornell.edu/bigreddata/dbcolloquium.php

http://davis.wpi.edu/dsrg/publications.html

http://dbs.ifi.uni-heidelberg.de/index.php?id=53


--international conferences in Databases

SIGMOD
http://www.sigmod2011.org/index.shtml
http://www.informatik.uni-trier.de/~ley/db/conf/sigmod/index.html

ICDE
http://www.informatik.uni-trier.de/~ley/db/conf/icde/index.html

PODS
http://www.informatik.uni-trier.de/~ley/db/conf/pods/index.html

ICDT
http://www.informatik.uni-trier.de/~ley/db/conf/icdt/index.html

VLDB
http://www.informatik.uni-trier.de/~ley/db/conf/vldb/index.html

IDAR
http://idar08.comp.nus.edu.sg/

JCSS
http://conference.researchbib.com/?action=viewEventDetails&eventid=18349&uid=rb9c01

BIRTE
http://conference.researchbib.com/?action=viewEventDetails&eventid=18040&uid=r5403e

EDBT
http://www.edbt.org/

DEBS
http://debs10.doc.ic.ac.uk/

CIKM
http://www.cikm2011.org/

Tuesday, November 15, 2011

The world of Databases - Proud to be a DBA

I fell in love with Databases, its a slow process and I didn't realize when it all started.
I heard people loving their profession and crazy about what they do in their offices but never experienced before ( this is true a couple of years ago).
I started my career with databases, vb, asp and .net, i chose or probably life chose Databases for me to continue further.
I didn't object then as I knew there is something that I love in writing queries, designing databases, building, rebuilding databases, studying table structure, columns, indexes, primary keys and so on and on and on.
I moved ahead with my career and today I am in my 3rd company since I started my career 7.5 years ago.
I am continuing working on Databases. Along with it my love and passion for Databases grew day by day and it started a part of my life now.
Like what people say, I am inseparable from the Database world.
I now realize that I've already fallen in love, enjoying my professional choice and proud to be a "DBA".
These three words represent a thousand words about a person working as a DBA.
I am dedicating this post to all the DBAs out there.


DBAs are irreplaceable. 
Every DBA has to endure so much pain, has to face abuse from so many people.
Every known/unknown face in the company talks about databases without actually knowing what is talking about and then starts screaming that there is something wrong with the Database.
DBAs have to work in shifts, 24/7, 365 days a year yet he receives no additional bonus or gains. No one even shows some some pity.
Any issue, any application impact and that's all for that day, our entire day will be with various people answering different questions, attending meetings, explaining them repeatedly again and again the issue, resolution, analysis etc etc.
Despite doing all this, people start complaining that its because we couldn't do our job properly and hence is the issue. There wouldn't be a single person who comes to our rescue and help us. 
Organizations should be proud of having a DBA. DBAs protect organizational data, we strive hard for backing up the data every single hour, every single day. we struggle for hours and days and fix issues.
We are the ones who never get appreciated.
We are the ones who are treated as a bunch of fools, non-engineers who don't know anything about IT except handling their databases.  How insane is this thought? 


DBAs are the tech gurus. A DBA designs database for Aircrafts for space shuttles and what not. He builds databases for Retail stores, for multi billion dollar organizations, stock exchanges, oil companies, Biological research and for so many different things in the world.
Every person in an organization IT is dependent on a DBA but still no one treats him with respect.


For eg:
1) There is a new project coming in, call the DBA please!!
2) Are we migrating our applications, where is your DBA?
3) Are we consolidating, downsizing, did u speak to your DBA?
4) Are you demerging, merging, acquiring a company, have you discussed with your DBA?
5) My company is bankrupt, really!! go and speak to your DBA first.
6) The world is coming to an end. God save the world but we, DBAs are the last persons on this earth who still work on backing up the data one final time :)




To all IT Analysts/Business owners/CIOs/Directors,
You should be proud of having a DBA in your organization. He is the one who protects your whole company data, maintaining your every single transaction happening anywhere across your company. 
Please accept DBAs as saviors of your company, give good bonus and keep us happy. If we are happy and doing our job properly then understand that your complete IT is in safe hands.


Proud to be a DBA

Wednesday, August 17, 2011

Buffer count and MaxTransferSize for backups in sql server

Here is the procedure to get the Buffer count and MaxTransfersize for your backups in sql servers

use the following trace flags:
DBCC TRACEON (3605)
DBCC TRACEON (3213)

Then run your backups and you can see the following information in your errorlog.

Backup/Restore buffer configuration parameters
Memory limit: 6143MB
Buffer count:               6
Max transfer size:          64 KB
Min MaxTransfer size:       64 KB
Total buffer space:         0 MB
Buffers per read stream:    6
Buffers per write stream:   6
Tabular data device count:  1
FileTree data device count: 0
Filesystem i/o alignment:   512

Monday, August 1, 2011

Estimate Compression in sql server 2008

Another good feature in 2008.
We can estimate the compression before actually executing it against the tables.
Here is the stored proc:

sp_estimate_data_compression_savings
or
sp_estimate_data_compression_savings <schemaname>,<tablename>,null,null,'page'

null,null are index id and partition numbers. you can give the specific index ids or partition no.s if you know otherwise null will list down for all partitions and all indexes on that table.
<schemaname>,<tablename>,null,null,'row'

Filtered indexes in sql server 2008

Just read a blog of Amit Bansal about Filtered indexes.
This is a really good feature.
You can create indexes on only a set of data like this:

create index on table(col1) where col1>234

Using this we can actually filter out the old data which is not used frequently and create index only on the current live data.

You can check the article here:
http://technet.microsoft.com/en-us/library/cc280372.aspx

Thursday, July 28, 2011

SQL Server IO

Excellent articles on SQL Server IO best practices and architecture are here:

SQL Server 2000 I/O Basics

SQL Server IO Best Practices Article

We can use the SQL Server IO stress tool to find out issues with IO subsystems on your servers:

SQLIOStress.exe
http://support.microsoft.com/default.aspx?scid=kb;en-us;231619

SQLIO.exe
http://download.microsoft.com/download/f/3/f/f3f92f8b-b24e-4c2e-9e86-d66df1f6f83b/SQLIO.msi

A few questions I have on the architecture:
Why sql server page size is 8Kb?
Why extend size is always 64 kb?
Heard that in oracle we can configure the page size, extent sizes for individual tables.
I still have to find out more about the IO disk subsystem, OS data blocks, Storage block sizes etc to find out why MS chose nonconfigurable sizes for pages and extents.
I'll updated this blog soon.

Thursday, July 21, 2011

row_number() function in sql server to get row offsets

This function started in sql server 2005 onwards.

eg:
select

you will get the records with each row having a row number like this:

rowno objectid
1          371178
2          371179
3          371181
4          371182


This is good when you want in between rows; lets say you want rows between 125 and 226 rows.
Though you can use between but between actually searches the data between the mentioned columns on which we are searching/filtering.

row_number() over (order by projectid) as rowno,objectid,tid,name from csforum

Tuesday, July 5, 2011

How to find objects/plans in memory in sql server

select
this will give you the database id, page id,page_type, row count so you can identify which pages(data/index) are in buffer for that particular database

select
this will as well give you the cached objects (compiled plans, adhoc, extended procs)

or

select
this will give the same result as syscacheobjects

select * from sys.dm_exec_plan_attributes(pass the plan handle from the above query here)
* from sys.dm_exec_cached_plans * from syscacheobjects * from sys.dm_os_buffer_descriptors

Monday, July 4, 2011

CLR App domains

Since sql server 2005 we got this new feature where we can load .net assemblies to the database and execute all our app functions.

You can view the existing app domains using the query:

select * from sys.dm_clr_appdomains

and the assemblies which are loaded can be seen here:
select * from sys.dm_clr_loaded_assemblies

The below error is due to insufficient CLR memory allocated:
to resolve this, you can use the startup parameter "-g" and give additional memory to the external assemblies and objects. By default sql server allocates 256 MB memory to external assemblies.

AppDomain 129 (testdb.dbo[runtime].135) is marked for unload due to memory pressure.

Tuesday, May 24, 2011

Clear DMV stats

Just found that we can clear DMV stats using the below query:

DBCC SQLPERF('sys.dm_os_latch_stats', CLEAR)
here you can mention any other DMV where you want to clear before starting your analysis.

Monday, April 18, 2011

Tables creation script

Just completed a script which generates "create tables" script.

Yet to add "defaults", "constraints", "indexes"., 'll update the script after I add these as well.

This is just a basic script, it doesn't include identity but it has default constraints added.

I just found its too very complex to get an accurate script because you need to take into account whether the column is marked for replication, whether identity is set on it, if it is computed or not, foreign keys and so on.
I am sorry to leave this unfinished but I lost patience in just trying to correct it/rewrite it, so 'll continue later.
But this script is still treated as a basic one to generate table scripts

Declare @query nvarchar(4000)
Declare @tbl_name varchar(50)
declare @schema varchar(10)
declare @subquery nvarchar(2000)

Declare  Del_arch_tbls cursor for

    Select table_name,table_schema from information_schema.tables where table_type<>'view'
   
Open del_arch_tbls
Fetch next from del_arch_tbls into @tbl_name,@schema

While @@fetch_status=0
begin

    declare @col nvarchar(30)
    declare @data_type nvarchar(10)
    declare @chr_length int
    declare @dt_precision int
    declare @isnull char(4)
    declare @collation nvarchar(40)
    declare @position int
    declare @num_precision int
    declare @col_query nvarchar(2000)
    declare @default varchar(100)
   
    --set @col_query=''
    Declare  cur_col cursor for

    Select column_name,data_type,isnull(character_maximum_length,0),is_nullable,isnull(numeric_precision,0),
    isnull(datetime_precision,0), isnull(collation_name,''),ordinal_position,isnull(column_default,'')
    from information_schema.columns where table_schema= @schema and table_name= @tbl_name
   
    open cur_col
    fetch next from cur_col into  @col,@data_type,@chr_length,@isnull,@num_precision,@dt_precision,
                            @collation,@position,@default
    while @@fetch_status=0
    begin

        -- 1.no need to add precision for datetime, int,smallint and bit, sql won't allow
        -- 2.actual generate script is not showing any collation levels for columns and hence
        -- commenting that code as well

        Set @col_query =  ' [' + @col + ']' + ' [' + @data_type + '] '
            if @chr_length >0
            begin
                set @col_query= @col_query + '('+cast(@chr_length as varchar)+') '
                if len(@default)>0
                    set @col_query= @col_query +  ' default ' + @default

                 --set @col_query= @col_query +  ' collate ' + @collation
            end
            else if @data_type='numeric'
            begin
                set @col_query = @col_query + '('+cast(@num_precision as varchar)+',0)'
                if len(@default)>0
                    set @col_query= @col_query +  ' default ' + @default
            end
--            else if @dt_precision >0
--            begin
--                set @col_query= @col_query + '('+cast(@dt_precision as varchar)+',0)'
--                if len(@default)>0
--                    set @col_query= @col_query +  ' default ' + @default
--            end
            if @isnull='yes'  set @col_query=@col_query + ' null'
            else
                set @col_query=@col_query + ' not null '
           
           
        --print @col_query
        --exec Sp_executesql @query
        --set @col_query=@col_query + ',' + @col_query
       
        if len(@subquery)>0
            set @subquery=@subquery + ',' + @col_query
        else
            set @subquery= @col_query

        fetch next from cur_col into  @col,@data_type,@chr_length,@isnull,@num_precision,@dt_precision,
                            @collation,@position,@default
       
    end
   
    Set @query = 'Create table [' + @schema + '].[' + @tbl_name  + '] (' + @subquery + ') on primary '

    print @query

    Close cur_col
    Deallocate cur_col

    set @subquery=''

    Fetch next from del_arch_tbls into @tbl_name,@schema

end


Close del_arch_tbls
Deallocate del_arch_tbls

Thursday, April 7, 2011

Audits in sql server 2008

Just noticed something new which started in sql server 2008.
Audits.

You can create and enable audits so all the related information like login success/failures, db backups/restores etc. will be logged in your audit file.
Its relatively simple to do.
I think this feature exists only in enterprise edition after sql server 2008

In order to create one here is the simple procedure:
1) create audit  by going to SSMS-> security tab->Audits tab
You can either choose the audit to be logged in windows application event log or security event log or your own file.

CREATE SERVER AUDIT [Audit-20110407-190438]
TO FILE
(    FILEPATH = N'W:\Audit\'
    ,MAXSIZE = 0 MB
    ,MAX_ROLLOVER_FILES = 2147483647
    ,RESERVE_DISK_SPACE = OFF
)
WITH
(    QUEUE_DELAY = 1000
    ,ON_FAILURE = CONTINUE
    ,AUDIT_GUID = '824f0cb7-328a-4875-89aa-dcf42ba7bceb'
)
GO

2) Once done, you need to create server audit specifications where each one should be associated with an Audit.
for each server audit specification, you can use events to be audited like "backup/restore", "login success, login failure, db schema change, etc.

CREATE SERVER AUDIT SPECIFICATION [ServerAuditSpecification-20110407-190608]
FOR SERVER AUDIT [Audit-20110407-190438]
ADD (BACKUP_RESTORE_GROUP),
ADD (FAILED_LOGIN_GROUP),
ADD (SUCCESSFUL_LOGIN_GROUP),
ADD (DATABASE_OBJECT_CHANGE_GROUP),
ADD (DATABASE_OWNERSHIP_CHANGE_GROUP)
WITH (STATE = ON)
GO

3) You can also enable DB audit specifications like "delete, insert, update, dbcc" etc.
so its like whenever user Test inserts some data in a particular table, it will be logged in the audit file


CREATE DATABASE AUDIT SPECIFICATION [DatabaseAuditSpecification-20110407-192130]
FOR SERVER AUDIT [Audit-20110407-190438]
ADD (UPDATE ON OBJECT::[RetailSales].[CustomerRetailerXRef] BY [db_owner]),
ADD (DELETE ON OBJECT::[dbo].[Profiles] BY [db_owner])
WITH (STATE = ON)
GO
I've created all these using SSMS GUI but you can do using scripts as well like mentioned above.
this ofcourse fills up your audit log file, so make sure you drive has enough space.

You can query your audit file using SSMS or from the below queries:

SELECT * FROM sys.fn_get_audit_file
 ('w:\Audit\Audit-20110407-190438_824F0CB7-328A-4875-89AA-DCF42BA7BCEB_0_129467019443560000.sqlaudit',default,default)

You can get your audit file information using the below query
select * from sys.server_audits
select * from sys.server_audit_specifications

 select * from sys.database_audit_specifications

One more function to get the Audit info from DMVs

select * from sys.dm_audit_actions

index usage and physical stats


select * from sys.dm_db_index_usage_stats where object_id=1779185784 will give you index usage stats like user seeks, scans, laster user seek , last lookup, last system scan etc.
this will help you in identifying which of the indexes in the table are in use since your last sql server restart.

select * from sys.dm_db_index_physical_stats(5,1779185784,1,1,'detailed')
                                                                (dbid,obj id, index id, partition no,options)
options - detailed, sample,limited,null

This will give you the fragmentation percentage much like dbcc showcontig, using this info we can identify if we need to rebuild that particular indexe

select * from sys.dm_db_index_operational_stats(5,1779185784,null,null)
                                                                     (dbid,obj id, indx id, partion no)

This will give details like leaf delete count, leaf insert count, leaf update count, page lock count, row lock count etc.
using this we can identify how many times inserts,updates and deletes happened for this table and how many different types of locks have been acquired on these indexes.

Tuesday, April 5, 2011

Missing Indexes

Have gathered some info about missing indexes.

there are limitations like the below:

1) it doesn't specify order for the columns to be created if you want to create the missing index
2) it can't gather statistics for more than 500 missing index groups
3) doesn't give accurate statistics for inequality predicates
eg. for queries such as this
select * from table1 where col1>33

Now here is how to analyze the missing index DMVs and create indexes for those.

Gather the missing indexes using the query:

select * from sys.dm_db_missing_index_details where database_id=5 and object_id=1779185784

gather the missing index index_handle

select * from sys.dm_db_missing_index_groups where index_handle in
(
select index_handle from sys.dm_db_missing_index_details where database_id=5 and object_id=1779185784
)

you can see the columns in the first query but can get for each index handle group using the below query

select * from sys.dm_db_missing_index_columns(pass the index_handle)

create indexes based on the missing indexes like this

first mention the indexes under "equality" predicate, then under "inequality" and then finally include the columns under "include" category.

Tuesday, March 15, 2011

Almost every major software and product is written in C,C++

You know what, Oracle RDBMS is written in C.
Unix OS is written in C,C++
Linux is written in C
MS Windows is written in C,C++
VB is written in Basic (exception)
MAC OS is written in C,C++
Java Compiler is written in C

There are other high level languages like COBOL, FORTRON, BASIC etc.
'll find out more soon

Wednesday, March 2, 2011

sys.dm_os_buffer_descriptors

select d.database_id,d.file_id,d.page_id,d.page_type,d.row_count,object_name(p.object_id),a.allocation_unit_id
from sys.dm_os_buffer_descriptors  d
left outer join sys.allocation_units a
on d.allocation_unit_id=a.allocation_unit_id
inner join sys.partitions p
on a.container_id=p.partition_id
where d.database_id>4
group by
d.database_id,d.file_id,d.page_id,d.page_type,d.row_count,object_name(p.object_id),a.allocation_unit_id

This query will give you all the objects in Buffer group by Database.
Remember, db_id of 32767 is Resource DB

Also just posted a question to Paul, waiting for a response from him:

When I the run the following query in sql server 2005

select * from sys.dm_os_buffer_descriptors where is_modified<>0

database_id
file_id
page_id
page_level
allocation_unit_id
page_type
row_count
free_space_in_bytes
is_modified
2
1
28919
0
1.48948E+14
INDEX_PAGE
0
8096
1
2
1
143
0
1.48948E+14
INDEX_PAGE
0
8096
1
2
1
152
0
5.6295E+14
INDEX_PAGE
135
3230
1
2
1
153
0
2.81475E+14
INDEX_PAGE
3
8045
1
2
1
154
0
2.81475E+14
INDEX_PAGE
3
8045
1

what does “is_modified” column signify?

Does it mean that the data has been modified and still not committed?

Also I could see rows with file_id=2(ldf files) when I run the below:
select * from sys.dm_os_buffer_descriptors where file_id>1

Tlog doesn’t have pages, so I could see page_id As zero but the page_types are confusing.

database_id
file_id
page_id
page_level
allocation_unit_id
page_type
row_count
free_space_in_bytes
is_modified
7
2
0
0
6488064
FILEHEADER_PAGE
2
6892
0
2
2
0
0
2.81475E+14
DATA_PAGE
5
4411
0
5
2
0
0
7.20576E+16
DATA_PAGE
16
192
0
4
2
0
0
6488064
FILEHEADER_PAGE
2
6900
0
10
2
0
0
6488064
FILEHEADER_PAGE
2
6884
0
1
2
0
0
7.20576E+16
IAM_PAGE
2
6
0

When I ran the below query to find out the objects these allocations are associated to I didn’t get anything.

select d.database_id,d.allocation_unit_id,
(
select object_id from sys.partitions where
partition_id=(select container_id from sys.allocation_units where allocation_unit_id=d.allocation_unit_id)
)
 from sys.dm_os_buffer_descriptors d where file_id>1

database_id
allocation_unit_id
(No column name)
7
6488064
NULL
3
6488064
NULL
4
1.92626E+14
NULL
10
2.81475E+14
60
2
2.81475E+14
60
1
7.20576E+16
NULL
5
7.20576E+16
NULL