SQL Help
I don't think we have a thread
Much better, cheers both!
Have you tried using Oracle instead?
Ok, this one's fucking me off.

Scenario: Producing a report of clients that have not paid their bloody invoices, with outstanding amounts split up into 30 day balance, 60 day balance etc.

Problem: Everything works fine, except where the client has made more than one payment (which obviously will end up being all of them) - the totals are multiplied by the number of payments.

Current SQL:
Code:
SELECT
    a.clientid, a.client_name,
    SUM(a.balance) AS total,
    SUM(a.balance30) AS total30,
    SUM(a.balance60) AS total60,
    SUM(a.balance90) AS total90,
    SUM(a.balance120) AS total120,
    SUM(a.balance_over) AS total_over,
    MAX(p.datereceived) AS last_payment_date
FROM (
    SELECT
        i.clientid, c.NAME AS client_name,
        i.outstanding_total AS balance,
        CASE WHEN ( DATEDIFF('2019-03-21', i.datepaymentdue) <= 30 ) THEN i.outstanding_total ELSE 0 END AS balance30,
        CASE WHEN ( DATEDIFF('2019-03-21', i.datepaymentdue) > 30 AND DATEDIFF('2019-03-21', i.datepaymentdue) <= 60 ) THEN i.outstanding_total ELSE 0 END AS balance60,
        CASE WHEN ( DATEDIFF('2019-03-21', i.datepaymentdue) > 60 AND DATEDIFF('2019-03-21', i.datepaymentdue) <= 90 ) THEN i.outstanding_total ELSE 0 END AS balance90,
        CASE WHEN ( DATEDIFF('2019-03-21', i.datepaymentdue) > 90 AND DATEDIFF('2019-03-21', i.datepaymentdue) <= 120 ) THEN i.outstanding_total ELSE 0 END AS balance120,
        CASE WHEN ( DATEDIFF('2019-03-21', i.datepaymentdue) > 120 ) THEN i.outstanding_total ELSE 0 END AS balance_over
        FROM accounts_invoices i
            INNER JOIN clients c ON i.clientid=c.clientid
        WHERE
            i.outstanding_total > 0 AND i.dateinvoiced <= '2019-03-21'
    ) AS a
LEFT JOIN accounts_payments p ON a.clientid=p.clientid
GROUP BY a.clientid
ORDER BY client_name


I'm assuming it's something to do with the JOIN on accounts_payments being outside of the subquery, but I can't seem to get it to work with it inside the subquery (it has to be a LEFT JOIN so that it also picks up clients that have made no payments, incidentally).

Halp!
Can you left join on (SELECT clientid, max(datereceived) from accounts_payments GROUP BY clientid) p instead of on accounts_payments?
Oh, and alias max(datereceived) as something, else it won't work.
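Something along these lines, anyway - totally untested from here, so treat it as a rough sketch built from your query above:
Code:
SELECT
    a.clientid, a.client_name,
    SUM(a.balance) AS total,
    SUM(a.balance30) AS total30,
    SUM(a.balance60) AS total60,
    SUM(a.balance90) AS total90,
    SUM(a.balance120) AS total120,
    SUM(a.balance_over) AS total_over,
    MAX(p.last_payment_date) AS last_payment_date
FROM (
    -- your existing per-invoice subquery, unchanged
    ...
) AS a
LEFT JOIN (
    -- one row per client, so it can't multiply the invoice rows
    SELECT clientid, MAX(datereceived) AS last_payment_date
    FROM accounts_payments
    GROUP BY clientid
) AS p ON a.clientid = p.clientid
GROUP BY a.clientid
ORDER BY client_name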
Just managed to get the LEFT JOIN working - I needed to group by the invoice ID to get around the problem I was having moving it into the subquery before, so now it's:
Code:
SELECT
    a.clientid, a.client_name,
    SUM(a.balance) AS total,
    SUM(a.balance30) AS total30,
    SUM(a.balance60) AS total60,
    SUM(a.balance90) AS total90,
    SUM(a.balance120) AS total120,
    SUM(a.balance_over) AS total_over,
    a.last_payment_date
FROM (
    SELECT
        i.clientid, c.NAME AS client_name,
        i.outstanding_total AS balance,
        CASE WHEN ( DATEDIFF('2019-03-21', i.datepaymentdue) <= 30 ) THEN i.outstanding_total ELSE 0 END AS balance30,
        CASE WHEN ( DATEDIFF('2019-03-21', i.datepaymentdue) > 30 AND DATEDIFF('2019-03-21', i.datepaymentdue) <= 60 ) THEN i.outstanding_total ELSE 0 END AS balance60,
        CASE WHEN ( DATEDIFF('2019-03-21', i.datepaymentdue) > 60 AND DATEDIFF('2019-03-21', i.datepaymentdue) <= 90 ) THEN i.outstanding_total ELSE 0 END AS balance90,
        CASE WHEN ( DATEDIFF('2019-03-21', i.datepaymentdue) > 90 AND DATEDIFF('2019-03-21', i.datepaymentdue) <= 120 ) THEN i.outstanding_total ELSE 0 END AS balance120,
        CASE WHEN ( DATEDIFF('2019-03-21', i.datepaymentdue) > 120 ) THEN i.outstanding_total ELSE 0 END AS balance_over,
        MAX(p.datereceived) as last_payment_date
        FROM accounts_invoices i
            INNER JOIN clients c ON i.clientid=c.clientid
            LEFT JOIN accounts_payments p ON p.clientid=i.clientid
        WHERE
            i.outstanding_total > 0 AND i.dateinvoiced <= '2019-03-21'
        GROUP BY i.invoiceid
    ) AS a
GROUP BY a.clientid
ORDER BY client_name
Joans wrote:
Can you left join on (SELECT clientid, max(datereceived) from accounts_payments GROUP BY clientid) p instead of on accounts_payments?

Yeah, think that'd work too, but managed to rubber duck a way out of it :) Cheers though.
GazChap wrote:
Joans wrote:
Can you left join on (SELECT clientid, max(datereceived) from accounts_payments GROUP BY clientid) p instead of on accounts_payments?

Yeah, think that'd work too, but managed to rubber duck a way out of it :) Cheers though.

Your way is neater, apparently this stuff can be hard when you don't have the data in front of you.
Joans wrote:
Your way is neater, apparently this stuff can be hard when you don't have the data in front of you.

Yeah, it's hard enough when you do have the data in front of you it seems, too!
Thinking cap time!
I have a stored procedure that returns a summary table of data (it will always return data, even if it's all 0s). I would like to know when that sp returns a table full of zeroes.
My original plan was to store the results of the sp in a tmp table and analyse that, but the summary sp already does an INSERT ... EXEC of its own internally, so I'm stuck at the "insert exec statement cannot be nested" error.
I can't really change the underlying procedure - is there a quick and nasty way I can get at the data?
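For reference, this is roughly the shape of what I'm trying - proc and column names made up, purely to illustrate:
Code:
-- The summary proc already does an INSERT ... EXEC of its own internally,
-- so wrapping it in another one hits the nesting limit.
CREATE TABLE #summary (metric_name VARCHAR(50), metric_value INT);

INSERT INTO #summary (metric_name, metric_value)
EXEC dbo.usp_GetSummaryReport;  -- fails: "An INSERT EXEC statement cannot be nested."

-- What I actually want to check afterwards:
IF NOT EXISTS (SELECT 1 FROM #summary WHERE metric_value <> 0)
    PRINT 'Summary is all zeroes';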
Happy I don't know SQL day!

I would Google, but I'm not entirely sure how to phrase it concisely, and I'm tired and my brain has stopped.

Let's, for the sake of argument, say I have two tables, each with 1 column:
tbl a has values 1, 5, 7, 12
tbl b has values 8, 10, 15, 19 (totally unrelated, I might have dates in one, ints in the other, probably going to have some text in there too).

I want a results set that is:
1 8
5 10
7 15
12 19
(so ideally, all the values from tbl a in order, and then all the values in tbl b in order; there might not necessarily be the same number of values in each table)

Am I just going to have to create a tmp table, give it an identity column and cycle through each row of each table (which isn't the end of the world) or am I just missing something obvious?
Ideally you shouldn't be using row number to link two sets of data. Is there anything else that says why row 1 of table one should match with row 1 of table two?

You can do it, just do a join on row_number - but it's making assumptions about your source data
Something like this:

SELECT A.val, B.val
FROM (
    SELECT val, ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS row_num
    FROM A
) A
JOIN (
    SELECT val, ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS row_num
    FROM B
) B ON A.row_num = B.row_num

You have to have an ORDER BY clause to use ROW_NUMBER, but ORDER BY (SELECT 0) should work, I think.
Cras wrote:
Ideally you shouldn't be using row number to link two sets of data. Is there anything else that says why row 1 of table one should match with row 1 of table two?

You can do it, just do a join on row_number - but it's making assumptions about your source data


I'd forgotten about row_number. The sources will all be temp tables that have been populated in order, so theoretically row_number should work, but I'm not sure I can bring myself to go upstairs and test it right now.

Each table will have literally nothing else in it, so there's no other criteria for joining, I'm just trying to circumvent exporting multiple datasets and putting them together in Excel or something.
Cras wrote:
Something like this:

SELECT A.val, B.val
FROM (
    SELECT val, ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS row_num
    FROM A
) A
JOIN (
    SELECT val, ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS row_num
    FROM B
) B ON A.row_num = B.row_num

You have to have an ORDER BY clause to use ROW_NUMBER, but ORDER BY (SELECT 0) should work, I think.


Ta, I knew there'd be some batshit SQL way to do it, rather than plodding through manually.
That seems to have worked perfectly. I had to do full outer joins as I've no idea which table might contain the most rows, but otherwise, exactly what I needed. Thanks again.
It's still a fugly hack, mind - but so is the majority of my output.
Fugly Hack is the name of my github account
It's the title of my résumé.
If it works...
I feel like I've asked this question a million times before, or at least a variant of the same scenario...

So, I have a table that records stock transactions, i.e. stock coming in, stock being sold, stock being transferred between locations, general adjustments etc.

It looks a little like this (some fields missed off for brevity):

Code:
+------+--------------+------------+---------------+---------------+-------------+-------------+----------------+----------------+---------------+---------------+
|  id  |  product_id  |  batch_id  |  location_id  |  reason_code  |  qty_inner  |  qty_outer  |  before_inner  |  before_outer  |  after_inner  |  after_outer  |
+------+--------------+------------+---------------+---------------+-------------+-------------+----------------+----------------+---------------+---------------+
|   1  |           1  |         1  |            1  |  received     |          0  |         20  |             0  |             0  |            0  |           20  |
|   2  |           1  |         2  |            1  |  received     |          0  |         10  |             0  |             0  |            0  |           10  |
|   3  |           1  |         1  |            1  |  xfer-out     |          0  |         -5  |             0  |            20  |            0  |           15  |
|   4  |           1  |         1  |            2  |  xfer-in      |          0  |          5  |             0  |             0  |            0  |            5  |
|   5  |           1  |         1  |            1  |  sold         |       -250  |          0  |             0  |            15  |          750  |           14  |
+------+--------------+------------+---------------+---------------+-------------+-------------+----------------+----------------+---------------+---------------+


What I want to do is create a query that pulls out the single most recent stock record for the combination of product, batch, and location.

Although you can't see it on the example table there, batch_id is unique (so product_id 2 might have batch_ids 3, 4 and 5, but not 1 and 2)

So the data I'd want out would be:

Code:
+------+--------------+------------+---------------+---------------+-------------+-------------+----------------+----------------+---------------+---------------+
|  id  |  product_id  |  batch_id  |  location_id  |  reason_code  |  qty_inner  |  qty_outer  |  before_inner  |  before_outer  |  after_inner  |  after_outer  |
+------+--------------+------------+---------------+---------------+-------------+-------------+----------------+----------------+---------------+---------------+
|   2  |           1  |         2  |            1  |  received     |          0  |         10  |             0  |             0  |            0  |           10  |
|   4  |           1  |         1  |            2  |  xfer-in      |          0  |          5  |             0  |             0  |            0  |            5  |
|   5  |           1  |         1  |            1  |  sold         |       -250  |          0  |             0  |            15  |          750  |           14  |
+------+--------------+------------+---------------+---------------+-------------+-------------+----------------+----------------+---------------+---------------+


Basically, I'm interested in the after_inner/after_outer numbers.

I've tried various GROUP BYs and subqueries and nothing seems to work quite as I need it to.
I'm assuming there's a date field in there somewhere?
So something like:
Code:
SELECT a.*
FROM tbl a
JOIN (
    SELECT MAX(date) AS maxdate, product_id, location_id, batch_id
    FROM tbl
    GROUP BY product_id, location_id, batch_id
) b ON a.date = b.maxdate AND a.product_id = b.product_id
    AND a.location_id = b.location_id AND a.batch_id = b.batch_id


Disclaimer - I'm on a week off, so if that's absolute crap then I apologise.
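Or, if your MySQL is new enough for window functions (8.0+), numbering the rows per product/batch/location would do the same job - equally untested, and assuming the date column really is just called date like in my example above:
Code:
SELECT t.*
FROM (
    SELECT tbl.*,
           ROW_NUMBER() OVER (PARTITION BY product_id, batch_id, location_id
                              ORDER BY date DESC) AS rn
    FROM tbl
) t
WHERE t.rn = 1  -- rn = 1 is the most recent row for each combination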
Sorry yeah, there is a date field.

That certainly looks like it's doing what I was hoping for. Cheers dude, much appreciated.

Didn't realise you could do a subquery as a join like that, but I guess it's just like joining to a temporary table??
No worries, glad it turns out I'm not a complete fraud :p

Functionally, yes, I think that would just be the same as dumping that in a temp table, and joining on that. Someone better at SQL optimisation can probably tell you which is better, although the answer is probably "it depends."
Well, it runs super fast as is so I would assume that going to the hassle of having the server create a temporary table as a separate operation would be less optimal in this case, for sure :)
MySQL will build a temporary table by itself if it needs one. If you watch the processlist really carefully you might see it happening.
Ain't nobody got time for that ;)
Honestly, it doesn't really matter. The optimisation engine in modern MySQL is so good that 99.9999% of the time you are better off leaving it the hell alone.
Hell yes. Database servers are smarter than you are.
Yeah, I'm quite happy to just leave it looking after itself.

Anyone on here ever set up MySQL (or even better, MariaDB) to use encryption at rest?

I mean, I get the concept and know how to enable it for particular tables or databases and what-not, but what I just cannot wrap my head around is how the hell key management works - given that you're supposed to have your keys in rotation via a separate key server, I just haven't got any clue how I'd go about setting such a key server up.
Apparently, I'm bad at SQL in May (I must have been off last year).
As usual, this feels like a stupid question, but my brain has gone blank.
Stored procedure, T-SQL: I want to compare two variables and, if they don't match, do "something".
Ah, but either of them could be null.
I feel like I would normally just do something like
Code:
If isnull(@a, 0) != isnull(@b, 0)

Where 0 is something that a and b would never be if they had a value, but that seems a bit dangerous, as maybe something will change in the future that means they could be 0.

So then I end up with:
Code:
If @a != @b or (@a is null and @b is not null) or (@a is not null and @b is null)

And that just seems like a really long way to do it, so there must be a better way, but my brain is mush, so I'm delegating to the internet.
https://stackoverflow.com/questions/661 ... sible-null seems to think coalesce could help, or just @a is not null and @b is not null and @a <> @b
I guess it depends on whether both a and b being null counts as a match or a no match.
Thank you, this is definitely the right direction.
I think not(coalesce(@a, @b) is null or @a = @b) should do the trick, but it's not quite working, so I'll try and work that out.
Thing is though, if you need to make the statement longer and clunkier to make it work then go for it.

If it's stupid and it works, it's not stupid.
Yeah, it's clunkier to write out but it's not inefficient in execution
Just to save me chasing this round in circles

Code:
If not (coalesce(@a, @b) is null or @a = @b)


Won't work if only one of @a or @b is null, right? Because you get not (false or unknown) = not unknown = unknown.

I could've just written this out the long way by now...
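One more for the toolbox, if it helps: the set operators treat NULLs as equal to each other, so (assuming "both NULL" should count as a match) something like this covers every case in one go:
Code:
IF EXISTS (SELECT @a EXCEPT SELECT @b)
BEGIN
    -- @a and @b differ, including the case where exactly one of them is NULL
    PRINT 'Values differ - do "something" here';
END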