

I've never worked with SQL Server partitioning, but I'm currently faced with designing a database for which the volumes probably warrant it.

The coupons are to be issued periodically, usually every six weeks, although there will also be ad-hoc issuance, e.g. for a special event. There are 15 million customers and, for each issuance event, every customer will receive 6 different coupon types, giving a total of 90 million coupon instances. We need to track the coupon instance redemption data and maintain it for 6 months, although typically a coupon is only valid for six weeks. Any redemption request for an invalid coupon will not reach the database because it will be validated by the POS till.

Over a six-month period we'll need to store up to 360 million rows in the Coupon Instance table and up to 72 million (assuming a max 20% redemption rate) in the Redemption table. I get the feeling that these numbers are too big for a single partition?

My question is: what to use as the partition key? One obvious candidate would be the issuance event, giving approximately 6 partitions. But then I think that maybe even that would give a partition size that is too large to allow for optimal performance? Would it be possible to partition by two keys, e.g. by issuance event + last digit of the customer id? So the logic would be:

If issuance event = 1 and last digit of customer id 4 then
Else if issuance event = 2 and last digit of customer id 4 then

Also, I'm not sure of the spec of the database server that we'll need. The SQL Server 2008 R2 64-bit db server will be provisioned as a VM from a very powerful host with access to a high-performance, large-capacity SAN. The db needs to be able to return a result from the coupon instance table, keyed on a numeric barcode value, in less than half a second. Validate (select) and redeem (insert) requests are expected to peak at approximately 3,500 per minute. Will 16 GB and 8 CPUs be enough?

I'd be very grateful for any advice from those that have deployed a SQL Server solution to manage similar volumes.

You really need to define your requirements a little more clearly.

In the PARTITION BY expression, we can use two or more columns to partition the result set. The expressions expression_1, expression_2 can only refer to columns made available by the FROM clause. We can't refer to an alias defined in the SELECT list.

Let us take two tables, "Package" and "Loan", for the implementation. Note that PARTITION BY is not a join: it does not return the common rows between two tables, it only groups the rows of one result set for the window function.

Below, PARTITION BY is applied to 'amount', partitioning by 'trip_no':

    ROUND (AVG(Amount) OVER ( PARTITION BY Trip_no )) AS avg_trip_amount

Below, PARTITION BY is applied to 'loan_amount', partitioning by 'loan_status':

    ROUND (AVG(loan_amount) OVER ( PARTITION BY loan_status )) AS avg_loan_amount

A PARTITION BY clause is not necessarily needed when we use ROW_NUMBER.

    ROW_NUMBER() OVER( PARTITION BY exp1,exp2.

    SELECT *, ROW_NUMBER() OVER (ORDER BY state) AS Row_Number
    SELECT *, ROW_NUMBER() OVER (PARTITION BY state ORDER BY state) AS Row_Number

Given below are the examples of PARTITION BY in SQL. The command below inserts data into the table:

    Insert into customer_order_data values ('sam','agile',560,3),

Now let's apply PARTITION BY to the above table:

Example #1

    COUNT(customer_orderofcount) OVER(PARTITION BY Customer_city) AS customer_CountOfOrders,
    AVG(customer_order_amount) OVER(PARTITION BY Customer_city) AS Avgcustomer_OrderAmount
    from CUSTOMER_ORDER_DATA;

Further window expressions over the same partition:

    AVG(customer_order_amount) OVER(PARTITION BY Customer_city) AS Avgcustomer_OrderAmount,
    MIN(customer_order_amount) OVER(PARTITION BY Customer_city) AS Mincustomer_OrderAmount,
    SUM(customer_order_amount) OVER(PARTITION BY Customer_city) AS Totalcustomer_OrderAmount

Things to consider about PARTITION BY in SQL: the PARTITION BY clause is used along with the OVER sub-clause. We use window functions to operate on each partition separately and recalculate the result for each data subset. Window functions include RANK(), LEAD(), MIN(), MAX() and COUNT(); ROUND() is an ordinary scalar function that can be wrapped around a window function's result. PARTITION BY is not a scalar subquery; the window function returns a single value for each row of the partition. We use ROW_NUMBER for paging purposes in SQL. It provides consecutive numbers for the rows in the result set, in the order given by the ORDER BY clause.
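The window expressions above can be tried end to end with a small script. This is a minimal sketch, not the article's own setup: it assumes Python's bundled SQLite is version 3.25 or later (needed for window functions), and the table layout is hypothetical. Only the row ('sam','agile',560,3) comes from the text; treating 'agile' as the city, the column name customer_name, and the other two rows are assumptions made for illustration.

```python
import sqlite3

# Hypothetical rows: only ("sam", "agile", 560, 3) appears in the text, and
# reading "agile" as the city column is an assumption.
ROWS = [
    ("sam", "agile", 560, 3),
    ("ann", "agile", 440, 1),
    ("bob", "delta", 300, 2),
]


def make_conn():
    """Create an in-memory table with an assumed four-column layout."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer_order_data ("
        "customer_name TEXT, customer_city TEXT, "
        "customer_order_amount INTEGER, customer_orderofcount INTEGER)"
    )
    conn.executemany("INSERT INTO customer_order_data VALUES (?, ?, ?, ?)", ROWS)
    return conn


def city_stats(conn):
    """Per-row aggregates computed over each city partition (rows are not collapsed)."""
    return conn.execute(
        """
        SELECT customer_name,
               customer_city,
               COUNT(customer_orderofcount) OVER (PARTITION BY customer_city),
               AVG(customer_order_amount)   OVER (PARTITION BY customer_city),
               MIN(customer_order_amount)   OVER (PARTITION BY customer_city),
               SUM(customer_order_amount)   OVER (PARTITION BY customer_city)
        FROM customer_order_data
        ORDER BY customer_city, customer_name
        """
    ).fetchall()


def numbered(conn):
    """ROW_NUMBER without PARTITION BY (one sequence) and with it (restarts per city)."""
    return conn.execute(
        """
        SELECT customer_name,
               customer_city,
               ROW_NUMBER() OVER (ORDER BY customer_name) AS rn_all,
               ROW_NUMBER() OVER (PARTITION BY customer_city
                                  ORDER BY customer_name) AS rn_in_city
        FROM customer_order_data
        ORDER BY customer_name
        """
    ).fetchall()


if __name__ == "__main__":
    conn = make_conn()
    for row in city_stats(conn):
        print(row)
    for row in numbered(conn):
        print(row)
```

Every input row is still present in the output of city_stats, which is the practical difference from GROUP BY. In numbered, rn_all runs across the whole result set while rn_in_city restarts at 1 for each city, so the first form is the one to use for paging over the whole table.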

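Returning to the coupon sizing question, the stated volumes can be checked with a few lines of arithmetic. This is a rough sketch using only the figures quoted in the question. The function partition_number is hypothetical: it shows one possible way to fold issuance event and last customer-id digit into a single number, which matters because a SQL Server partition function works on a single column (typically a persisted computed column when the key is derived from two values). The even spread across last digits is an assumption.

```python
# Figures quoted in the question.
CUSTOMERS = 15_000_000
COUPON_TYPES_PER_EVENT = 6
RETAINED_COUPON_ROWS = 360_000_000
MAX_REDEMPTION_PERCENT = 20
PEAK_REQUESTS_PER_MINUTE = 3_500

# Rows added to the coupon instance table by one issuance event.
rows_per_event = CUSTOMERS * COUPON_TYPES_PER_EVENT

# Number of full issuance events that fit in the retained row count.
events_retained = RETAINED_COUPON_ROWS // rows_per_event

# Upper bound on redemption rows at the stated maximum redemption rate.
max_redemption_rows = RETAINED_COUPON_ROWS * MAX_REDEMPTION_PERCENT // 100

# Peak load expressed per second.
peak_requests_per_second = PEAK_REQUESTS_PER_MINUTE / 60

# Rows per partition if each event is split by last customer-id digit,
# assuming last digits are evenly distributed (an assumption).
rows_per_partition = rows_per_event // 10


def partition_number(issuance_event: int, customer_id: int) -> int:
    """Hypothetical mapping: ten partitions per issuance event, numbered from 1.

    Event 1 uses partitions 1-10, event 2 uses 11-20, and so on.
    """
    if issuance_event < 1 or customer_id < 0:
        raise ValueError("issuance_event must be >= 1 and customer_id >= 0")
    return (issuance_event - 1) * 10 + customer_id % 10 + 1


if __name__ == "__main__":
    print(rows_per_event, events_retained, max_redemption_rows)
    print(round(peak_requests_per_second, 1), rows_per_partition)
    print(partition_number(1, 1234), partition_number(2, 1234))
```

Under these assumptions each partition would hold about 9 million coupon rows instead of 90 million, and the peak load is a little under 60 requests per second. Whether that partition size helps the half-second barcode lookup depends mostly on indexing the barcode column, which this sketch does not address.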
