google-bigquery – BigQuery SQL for running totals

Any idea how to calculate a running total in BigQuery SQL?

id   value   running total
--   -----   -------------
1    1       1
2    2       3
3    4       7
4    7       14
5    9       23
6    12      35
7    13      48
8    16      64
9    22      86
10   42      128
11   57      185
12   58      243
13   59      302
14   60      362

In a traditional SQL server this is not a problem, using a correlated scalar subquery:

SELECT a.id, a.value, (SELECT SUM(b.value)
                       FROM RunTotalTestData b
                       WHERE b.id <= a.id)
FROM   RunTotalTestData a
ORDER BY a.id;

or a join:

SELECT a.id, a.value, SUM(b.Value)
FROM   RunTotalTestData a,
       RunTotalTestData b
WHERE b.id <= a.id
GROUP BY a.id, a.value
ORDER BY a.id;

But I couldn't find a way to make it work in BigQuery...

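You have probably figured it out by now. But here is one way, though not the most efficient:

JOIN can only be done using equality comparisons, i.e. b.id <= a.id cannot be used as the join condition. https://developers.google.com/bigquery/docs/query-reference#joins

This is pretty lame if you ask me. But there is a workaround. Just use an equality comparison on some dummy value to get the Cartesian product, and then apply the <= condition in the WHERE clause. This is crazily suboptimal, because every row is paired with every other row before filtering. But if your tables are small, it will work.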

-- Sum over b (all rows at or before a.id), not over a
SELECT a.id, SUM(b.value) AS rt
FROM RunTotalTestData a
JOIN RunTotalTestData b ON a.dummy = b.dummy
WHERE b.id <= a.id
GROUP BY a.id
ORDER BY rt
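To get a feel for why the dummy-key join is so expensive, here is a rough sketch in Python (not BigQuery code) of how many row pairs it produces. The function name join_row_counts is made up for illustration, and it assumes every id in the table is distinct.

```python
def join_row_counts(n):
    """Row counts for the dummy-key self-join on a table with n distinct ids.

    Returns (pairs produced by the join, pairs kept by WHERE b.id <= a.id).
    """
    produced = n * n            # every row of a is paired with every row of b
    kept = n * (n + 1) // 2     # for the k-th smallest id, k rows of b survive
    return produced, kept


if __name__ == "__main__":
    for n in (14, 1_000, 1_000_000):
        produced, kept = join_row_counts(n)
        print(f"{n} rows -> {produced} joined pairs, {kept} kept")
```

For the 14-row sample table that is 196 joined pairs, of which 105 are kept. The growth is quadratic, which is why this only makes sense for small tables or a narrow time range.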

You can also constrain the time range manually:

-- foo and bar are placeholders for the bounds of the time range
SELECT a.id, SUM(b.value) AS rt
FROM (
    SELECT id, timestamp, dummy
    FROM RunTotalTestData
    WHERE timestamp >= foo
    AND timestamp < bar
) AS a
JOIN (
    SELECT id, timestamp, value, dummy
    FROM RunTotalTestData
    WHERE timestamp >= foo
    AND timestamp < bar
) AS b ON a.dummy = b.dummy
WHERE b.id <= a.id
GROUP BY a.id
ORDER BY rt

Update:

You don't need a special property (a dedicated dummy column) in the table. You can use

SELECT 1 AS one

in each subquery and join on that.
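Here is a small sketch of the constant-column join, run against the sample data from the question. It uses Python's built-in sqlite3 module purely as a local stand-in so the logic can be checked; it is not BigQuery, and BigQuery's syntax or behavior may differ. The names QUERY, ROWS and running_totals are made up for this example.

```python
import sqlite3

# Sample data from the question: (id, value) pairs.
ROWS = [(1, 1), (2, 2), (3, 4), (4, 7), (5, 9), (6, 12), (7, 13),
        (8, 16), (9, 22), (10, 42), (11, 57), (12, 58), (13, 59), (14, 60)]

# The constant column `one` gives the JOIN an equality condition that always
# holds, so the join yields the full Cartesian product. The WHERE clause then
# keeps, for each row of a, only the rows of b at or before it.
QUERY = """
SELECT a.id, a.value, SUM(b.value) AS rt
FROM (SELECT id, value, 1 AS one FROM RunTotalTestData) AS a
JOIN (SELECT id, value, 1 AS one FROM RunTotalTestData) AS b
  ON a.one = b.one
WHERE b.id <= a.id
GROUP BY a.id, a.value
ORDER BY a.id
"""


def running_totals(rows):
    """Load rows into an in-memory SQLite table and return QUERY's result."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE RunTotalTestData (id INTEGER, value INTEGER)")
        conn.executemany("INSERT INTO RunTotalTestData VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in running_totals(ROWS):
        print(row)
```

Note that the sum is taken over b.value and the result is ordered by a.id, so the rows come back in the same order as the table in the question.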

As far as billing goes, the joined table counts toward the data processed.
