RSS feed [root] /weblog /database




login:

password:

title search:




 


Thu Apr 06 06:32:46 GMT 2023

database



(google search) (amazon search)
second
download zip of files only

Fri May 03 13:19:43 GMT 2024 From /weblog/database/vendor

redshift


All insights from Redshift academic paper: Amazon Redshift re-invented in 2022 - https://blog.det.life[..]redshift-heres-what-i-found-85c31a59fd19

(google search) (amazon search)


Sun Mar 17 13:03:38 GMT 2024 From /weblog/database/transaction

transaction


Transaction, same as all the concept in computing, is an approach and trade-off - http://www.allthingsdistributed.com[..]d.com/2007/12/eventually_consistent.html

Transaction file system - http://www.infoq.com/news/2008/01/file-systems-transactions http://myjavatricks.com/jtfs.aspx

Explanation about database isolation level - http://highscalability.com[..]-and-their-effects-on-performance-a.html https://begriffs.com[..]08-01-practical-guide-sql-isolation.html

A nice article explain about transaction process - http://www.theserverside.com[..]Nuts-and-Bolts-of-Transaction-Processing

http://vladmihalcea.com[..]guide-to-acid-and-database-transactions/

SQL Transactions and ACID Properties - https://levelup.gitconnected.com[..]actions-and-acid-properties-bb5b670538f8

(google search) (amazon search)


Wed Feb 01 02:37:46 GMT 2023 From /weblog/database/performance

Note of SQL performance tuning


Note of this book - http://www.ocelot.ca/tuning.htm

1) Order of "and" and "or" is matter, if no cost-base optimizer

where column1 = 'a' and column2 = 'b' -> where column2 = 'b' and column1 = 'a' if column2 'b' is less likely

2) <> is more expensive than =

3) Can force to use index if SELECT * FROM table WHERE indexed_column > 0

4) UPPER(column) might lose information, use LOWER is better

5) WHERE column = 'WORD' or column = 'word' is always faster than WHERE LOWER(column) = 'word' , even faster version is test if database if case-insensitive: WHERE column = 'WORD' or ('WORD' <> 'word' AND column = 'word')

6) 32bit integer is the fastest datatype for 32bit OS

7) column in (?, ?) is faster than column =? or column = ? for some database, and don't slow in any database

8) For aggregate functions, prevent using having, try to use fewer columns in group by and use subquery if possible

9) In fact, count(1) is not faster than count(*)

10) CAST before aggregate function: SELECT CAST(SUM(column) AS INT) -> SELECT SUM(CAST(column as INT))

11) Use constant as much as possible, e.g.: select * from t1, t2 where t1.c1 = t2.c1 and t1.c1 = 1 -> select * from t1, t2 where t1.c1=1 andn t2.c1=1

12) Information of Join vs Subqueries

13) Information of optimum column type (VARCHAR, TIMESTAMP, WORD size INT and CHAR(1) usually better), shift effect for page

14) prevent NULL for porting issue

15) Low level table physical information

Some related links - http://www.techartifact.com[..]9/12/sql-optimization-tipsquestions.html

16) Prevent database level transaction, prevent joining, prevent locking ( like auto increment key ), pretty good suggestion - http://www.aviransplace.com/2015/08/12/mysql-is-a-great-nosql/

17) Reduce dataset for aggregation - https://www.periscopedata.com[..]-subqueries-to-count-distinct-50x-faster

Few good tips, like avoid cursors - http://www.javaworld.com[..]ent/21-rules-for-faster-sql-queries.html

7 Tips for Query Optimization in SQL Servers - https://learningdaily.dev[..]optimization-in-sql-servers-fb3653c910f1

(google search) (amazon search)


Thu Aug 11 13:31:24 GMT 2022 From /weblog/database/vendor

mysql


It is surprising for me that it is this easy - http://alan.blog-city.com/read/1204253.htm

MySQL Proxy for table partition - http://www.hscale.org/display/HSCALE/Home http://pero.blogs.aprilmayjune.org/ http://forge.mysql.com/wiki/MySQL_Proxy

NoSQL way to access data in MySQL - http://yoshinorimatsunobu.blogspot.com[..]0/10/using-mysql-as-nosql-story-for.html

The lock level of mysql - http://sgdev-blog.blogspot.com.au[..]ql-transaction-isolation-levels-and.html

MySQL rewriteBatchedStatements configuration property - https://vladmihalcea.com/mysql-rewritebatchedstatements

(google search) (amazon search)


Tue Aug 02 13:18:23 GMT 2022 From /weblog/database

connection


Maximum number of database connections - https://vladmihalcea.com/maximum-database-connections/

(google search) (amazon search)



Thu Jun 24 14:57:56 GMT 2021 From /weblog/database/performance

index


Case study about how to find out suitable index - http://samsaffron.com[..]e+life+of+a+slow+page+at+Stack+Overflow#

A simple example of how to use EXPLAIN to find out the bottleneck of the query and add suitable index to suitable column - http://hackmysql.com/case4

Bitmap Index vs. B-tree Index: Which and When? - http://www.oracle.com/technology/pub/articles/sharma_indexes.html http://publib.boulder.ibm.com[..]/v5r3/index.jsp?topic=/rzajq/perf21c.htm

script to check if index is missing at foreign key

select table_name, constraint_name,
cname1 || nvl2(cname2,','||cname2,null) ||
nvl2(cname3,','||cname3,null) || nvl2(cname4,','||cname4,null) ||
nvl2(cname5,','||cname5,null) || nvl2(cname6,','||cname6,null) ||
nvl2(cname7,','||cname7,null) || nvl2(cname8,','||cname8,null)
columns
from ( select b.table_name,
b.constraint_name,
max(decode( position, 1, column_name, null )) cname1,
max(decode( position, 2, column_name, null )) cname2,
max(decode( position, 3, column_name, null )) cname3,
max(decode( position, 4, column_name, null )) cname4,
max(decode( position, 5, column_name, null )) cname5,
max(decode( position, 6, column_name, null )) cname6,
max(decode( position, 7, column_name, null )) cname7,
max(decode( position, 8, column_name, null )) cname8,
count(*) col_cnt
from (select substr(table_name,1,30) table_name,
substr(constraint_name,1,30) constraint_name,
substr(column_name,1,30) column_name,
position
from sys.user_cons_columns ) a,
sys.user_constraints b
where a.constraint_name = b.constraint_name
and b.constraint_type = 'R'
group by b.table_name, b.constraint_name
) cons
where col_cnt > ALL
( select count(*)
from sys.user_ind_columns i
where i.table_name = cons.table_name
and i.column_name in (cname1, cname2, cname3, cname4,
cname5, cname6, cname7, cname8 )
and i.column_position <= cons.col_cnt
group by i.index_name
)

http://chriswongdevblog.blogspot.com.au[..]01/dont-forget-to-index-your-oracle.html

Explain how postgres indexes work, look like combine index is not that useful - http://dublintech.blogspot.com.au/2015/07/postgres-indexes.html

SQL Server JDBC – Set sendStringParametersAsUnicode to false - https://vladmihalcea.com[..]rver-jdbc-sendstringparametersasunicode/

(google search) (amazon search)


Mon Jun 21 13:42:47 GMT 2021 From /weblog/database

lock


Look like using "select XXX for update" is widely supported

http://www.unix.org.ua/orelly/oracle/prog2/ch06_11.htm
http://www.google.com/search?q=select+FOR+UPDATE

BTW, someone suggest using it probably ok for using (nolock) at all - http://www.codinghorror.com/blog/archives/001166.html

Lock in mysql - https://www.infoq.cn/article/ZAu0eWzsdTX9zoFr6C8W

(google search) (amazon search)


Tue Oct 15 04:06:14 GMT 2019 From /weblog/database/vendor

misc


Few distributed SQL DB vendor information - http://www.infoq.com/news/2013/11/sql-newsql-nosql

Approximation DB - http://highscalability.com[..]e-200x-faster-without-having-to-pay.html

Discuss and compare difference technology of querying journey(log) data - http://www.infoq.com[..]on-log-and-data-storage-query-techniques

SAP HANA - https://www.google.com/search?q=sap+hana

(google search) (amazon search)


Mon Jun 17 00:30:14 GMT 2019 From /weblog/database/sql

join


http://www.codeproject.com[..]33052/Visual-Representation-of-SQL-Joins http://nosql.mypopescu.com/post/51770099382/sql-joins-visualized

With IN and subquery / or "top" like keywords - http://javarevisited.blogspot.com.au[..]econd-highest-or-maximum-salary-sql.html

Top 5 Online Courses to Learn SQL and Database for Beginners - Best of Lot
- https://dev.to[..]-database-for-beginners-best-of-lot-5533

(google search) (amazon search)


Sat Mar 31 08:14:54 GMT 2018 From /weblog/database/vendor

mssql


SQL 2005 JDBC fix the resultset re-read throws exception problem. However, you need to use executeUpdate() for insert,update and delete instead of execute() for SQL 2005 JDBC, otherwise will throw exception on this.

Other that, statement.executeBatch() will throws exception: http://lab.msdn.microsoft.com[..]kid=e13a8009-7466-4803-ba19-6bfd4b5f8966

New function of SQL server 2014, In-Memory Optimization tables, suppose to be faster? - http://www.infoq.com/news/2013/09/MOT-Indexes MOT table also support native compiled query - http://www.infoq.com/news/2013/09/Compiled-Queries , and lock free write? http://www.infoq.com/news/2013/09/Lock-Free-Writes

From microsoft, Use the columnstore index to achieve up to 10x query performance gains over traditional row-oriented storage, and up to 7x data compression over the uncompressed data size. - http://www.infoq.com/news/2013/09/Clustered-Columnstore

A tool from M$ to help sql server user to manage change of database: http://msdn.microsoft.com/vstudio/teamsystem/products/dbpro/ . Other similar product is DBGhost, also only work for SQL server http://www.innovartis.co.uk/home.aspx

There is a free version of SQL server 2005 call SQL server express, which just like MSDE http://www.microsoft.com/sql/editions/express/redistregister.mspx

Here is a comparison: http://www.sqlmag.com[..]le/ArticleID/49618/sql_server_49618.html , may be we can use that instead of MSDE 2005 and discoutinue support of SQL server 2000 so that we can test one thing less?

Import CSV to SQL server - http://www.sqlservercurry.com[..]oad-comma-delimited-file-csv-in-sql.html

Shriking DB cause performance problem... not sure how long this finding is hold - http://blog.sqlauthority.com[..]eases-fragmentation-reduces-performance/

http://beyondrelational.com[..]ver-storage-internals-part-1-basics.aspx

Get the metadata - http://gigi.nullneuron.net[..]-metadata-from-sql-server-catalog-views/ https://simpleprogrammer.com[..]icrosoft-sql-server-metadata-developers/

Date functions - http://www.essentialsql.com[..]roduction-to-sql-servers-date-functions/

Enhancement of 2016 - https://www.infoq.com/news/2016/06/SQl-Server-2016-Performance

Tool to rescue sql server - https://www.infoq.com/news/2017/02/SQL-First-Responder https://github.com/BrentOzarULTD/SQL-Server-First-Responder-Kit http://firstresponderkit.org/

Prevent locking with NOLOCK / READPAST - https://www.techrepublic.com[..]-and-readpast-table-hints-in-sql-server/

(google search) (amazon search)


Fri Nov 03 01:41:42 GMT 2017 From /weblog/database/performance

numa


https://technet.microsoft.com[..]m/en-us/library/ms345403(v=sql.105).aspx

https://code.kx.com/q/cookbook/linux-production/

http://jcole.us[..]swap-insanity-and-the-numa-architecture/

(google search) (amazon search)


Fri Sep 22 07:29:28 GMT 2017 From /weblog/database/performance

architecture


IT Hare: Ultimate DB Heresy: Single Modifying DB Connection. Part I. Performanc - http://highscalability.com[..]y-single-modifying-db-connection-pa.html

How yandex.metrica improve DB reading - http://highscalability.com[..]of-data-structures-in-yandexmetrica.html

(google search) (amazon search)


Fri Sep 08 16:15:42 GMT 2017 From /weblog/database

uuid


https://segment.com/blog/a-brief-history-of-the-uuid/
http://www.infoq.com/cn/articles/talk-about-the-history-of-uuid

If using uuid as PK is good idea? - https://tomharrisonjr.com[..]-as-primary-keys-be-careful-7b2aa3dcb439

(google search) (amazon search)


Wed Jul 26 02:57:03 GMT 2017 From /weblog/database

time series


Some information about Time Series:

http://en.wikipedia.org/wiki/S-PLUS
http://download-west.oracle.com[..]ocs/cd/A87860_01/doc/inter.817/index.htm
http://www.sas.com/
http://oracledmt.blogspot.com/2006/10/time-series-revisited.html
http://oracledmt.blogspot.com[..]1/time-series-forecasting-part-1_23.html

Discussion of time in time series database - UTC or local - http://quant.stackexchange.com[..]ime-in-time-series-database-utc-or-local

Market data:
http://cs.nyu.edu/cs/faculty/shasha/fintime.html

and a book:
http://sql-info.de[..]iented-database-applications-in-sql.html
http://www.cs.arizona.edu/people/rts/tdbbook.pdf

Should we use DATE as key or INT as key? http://quant.stackexchange.com[..]senting-time-series-implementation-level

Collections of column-oriented stores - http://quant.stackexchange.com[..]s/1392/usage-of-nosql-storage-in-finance

时序数据库深入浅出之存储篇 - http://www.infoq.com/cn/articles/storage-in-sequential-databases

深入浅出时序数据库之压缩篇 - http://www.infoq.com/cn/articles/condense-in-sequential-databases

pretreatment - http://www.infoq.com[..]les/pretreatment-in-sequential-databases

Why - http://www.infoq.com/cn/news/2017/07/Why-time-series-database

分级存储 - http://www.infoq.com[..]archical-storage-of-sequential-databases

(google search) (amazon search)


Tue Jan 03 01:58:28 GMT 2017 From /weblog/database

database comparison


Wiki of common DBMS comparison - http://en.wikipedia.org[..]rison_of_SQL_database_management_systems

Somebody comment that postgresql is less competency than MSSQL, see the comment
http://www.baubels.net/blog/?p=88

Alternative DBMS - http://www.webresourcesdepot.com[..]ternative-open-source-databases-engines/

http://blog.jooq.org[..]0-most-popular-db-engines-sql-and-nosql/

http://www.xaprb.com[..]2015/05/25/what-makes-a-solution-mature/

http://www.javacodegeeks.com/2015/07/mysql-vs-mongodb.html

Comparing major RDBMS - http://www.infoq.com[..]elect-the-appropriate-rdmbs-for-the-oltp

(google search) (amazon search)


Tue Oct 18 17:41:16 GMT 2016 From /weblog/database/vendor

pgsql


http://www.javacodegeeks.com[..]0/introduction-to-postgresql-pljava.html

PostgreSQL schema is way fast and easy to change than MySQL, although maybe slower - https://www.pgrs.net[..]grating-from-mysql-to-postgresql-slides/

Subindexes that can save you space and make updates faster by reusing an existing multi-column index.
Expression indexes that speed up queries with WHERE clauses matching a function of the columns instead of raw values.
Partial indexes that are much smaller than a complete index because they don't contain data that the app doesn't care about.
http://blog.scoutapp.com[..]l-indexing-tricks-that-surprise-everyone

Experience sharing, stored proc is better? http://www.infoq.com/cn/news/2016/10/postgres-and-financial-arch

(google search) (amazon search)


Thu Oct 13 00:33:54 GMT 2016 From /weblog/database

reading


Berkeley CS286: Implementation of Database Systems, Fall 2014‎ > ‎http://www.cs286.net/home/reading-list

Disambiguating Databases, explaining various aspects of difference datastore approach - http://queue.acm.org/detail.cfm?ref=rss&id=2696453

Basic - http://ithare.com[..]action-isolation-levels-and-concurrency/


(google search) (amazon search)


Wed Jun 22 23:23:32 GMT 2016 From /weblog/database/vendor

oracle


Some comment saying that you cannot complain oracle for non-standard setup... However, should I break anything else just to install oracle? Anyway, may be a useful resource of installing oracle cluster - http://thedailywtf.com/forums/65743/ShowPost.aspx

Oracle SQL links - http://mycodeblog.blogspot.com/2007/06/sql-cheat-sheet.html

Pure java solution to full text search - http://www.infoq.com/news/2007/10/lucene-oracle http://marceloochoa.blogspot.com[..]nning-lucene-inside-your-oracle-jvm.html http://dbprism.cvs.sourceforge.net/dbprism/ojvm/

Oracle does allow definitions of user-defined types.Please see
http://download.oracle.com[..]er.111/b28286/sql_elements001.htm#i46376

User-defined types can then be used for column definitions. See e.g.
the fourth example given here
http://download.oracle.com[..].111/b28286/statements_7002.htm#i2062833

Oracle also allows creation of a table based on one user-defined type,
i.e. an "object table".
See here
http://download.oracle.com[..].111/b28286/statements_7002.htm#i2159410

User-defined types can be changed; see
http://download.oracle.com[..].111/b28286/statements_4002.htm#i2057828
Oracle even offers an option as to whether existing data in tables
based on previous type definition is converted or not; see
http://download.oracle.com[..].111/b28286/statements_4002.htm#i2079300
There are a couple of pages in the Object-Relational Developer's Guide explaining the consequences, briefly...

Note: all links above are to Oracle 11g documentation, but many, if not all, of these features have been available since 9i.

row base permission - http://technotes.towardsjob.com[..]/oracle/virtual-private-database-oracle/

How fetch size can affect performance - http://www.skill-guru.com[..]jdbc-performance-tuning-with-fetch-size/

Recycle bin for Oracle 10g - http://mahmoudoracle.blogspot.gr[..]ecycle-bin-in-database.html#.UNXprSUqthH

Oracle won't put null in index... how can we find the column with issue out? - http://blog.jooq.org[..]12/the-index-youve-added-is-useless-why/

You can have java like stacetrace at PLSQL - http://blog.jooq.org/2014/08/29/plsql-backtraces-for-debugging/

Need to be careful about using rownum to limit the result in oracle - http://www.monkeycancode.com[..]racle-limit-returned-records-with-rownum

(google search) (amazon search)


Mon Jun 06 08:09:19 GMT 2016 From /weblog/database

database style


A pointer to various database style discussion, like [application database], [integration database], [silly database], [smart database] ...

http://www.pervasivecode.com[..]n-of-stupid-databases-being-a-good-idea/

BDD for database design

Requirements are behavior,
Provides "ubiquitous language" for analysis,
Acceptance criteria should be executable.
Design constraints should be made into executable tests.

- http://www.methodsandtools.com/archive/archive.php?id=78

http://dublintech.blogspot.hk/2016/06/agile-databases.html

(google search) (amazon search)



Fri Jan 22 03:39:27 GMT 2016 From /weblog/database/sql

grouping


There is way to drop the dummy aggregate function in group by - http://blog.jooq.org[..]onal-dependencies-a-very-useful-feature/

(google search) (amazon search)


Fri Aug 28 10:19:53 GMT 2015 From /weblog/database/vendor

sybase


If you get following exception with Sybase ASE drive:

'System.Reflection.TargetInvocationException, Msg=Exception has been thrown by the target of an invocation.
Exception details: System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> System.TypeInitializationException: The type initializer for 'Sybase.Data.AseClient.AseConnection' threw an exception. ---> System.DllNotFoundException: Unable to load DLL 'sybdrvado115a.dll': The specified module could not be found. (Exception from HRESULT: 0x8007007E)
at ?.AseGetDriverVersion(StringBuilder A_0, Int32 A_1)
at Sybase.Data.AseClient.AseConnection.?()
at Sybase.Data.AseClient.AseConnection..cctor()
--- End of inner exception stack trace ---
at Sybase.Data.AseClient.AseConnection..ctor(String connectionString)

Please make sure the program can see sybdrvado115.dll, msvcp71.dll and msvcr71.dll in that same directory

http://www.elsasoft.org/SqlSpec.htm

===================

Query to check blocking at sybase:
select spid,suser_name(suid),cmd,status,blocked,time_blocked,physical_io from master..sysprocesses where blocked <> 0




(google search) (amazon search)


Sun Jul 05 00:25:32 GMT 2015 From /weblog/database

data


The Data Analytics Handbook - https://www.teamleada.com/handbook

(google search) (amazon search)


Tue Jun 09 09:22:45 GMT 2015 From /weblog/database

view


http://blog.jooq.org/2015/06/02/what-exactly-are-sql-views/

(google search) (amazon search)