Aggregated Intelligence

Friday, May 30, 2025

PowerBi + Aws Athena + Dataflow Error: Incorrect number of arguments

A dataflow that I had suddenly stopped working.

The error was

Encountered user gateway exception: '<ccon>ODBC: ERROR [HY000] Incorrect number of arguments: </ccon>

After a lot of troubleshooting, I figured out it was being caused by some steps I had added to create additional columns. Most likely, those steps were being query-folded (translated into SQL and pushed down to Athena), and the folded query was failing.

I solved it by adding a call to Table.StopFolding right before I added the steps.

Reference: Table.StopFolding - Table Function | Power Query M

Wednesday, April 09, 2025

AWS Lambda error: Sandbox.Timedout 3.00 Seconds

In an AWS Lambda function, we suddenly started seeing this error:

{
  "errorType": "Sandbox.Timedout",
  "errorMessage": "RequestId: xxxxxx-xxxx-xxx-xxxxxxxxxxx Error: Task timed out after 3.00 seconds"
}

The "Sandbox.Timedout" threw me and I could not figure out where it was coming from.

It turned out the error was being thrown by the AWS Lambda infrastructure, because the function was configured to run for only 3 seconds. The fix is to raise that limit: on the function's Configuration tab, edit "General configuration" and increase the Timeout setting.
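As a rough illustration of how the limit looks from inside a Python handler: the Lambda context object exposes get_remaining_time_in_millis(), which a handler can check so it stops cleanly instead of being killed mid-work. This is only a sketch; the handler, the SAFETY_MARGIN_MS value, the doubling "work" and the FakeContext stand-in are all hypothetical and not taken from the function that failed.

```python
import time

# Hypothetical buffer: stop taking on new work when less than this remains.
SAFETY_MARGIN_MS = 500


def process_items(items, context):
    """Process items until the invocation gets close to its configured timeout."""
    done = []
    for item in items:
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            break  # leave the rest for a later invocation
        done.append(item * 2)  # placeholder for the real work
    return done


def handler(event, context):
    """Lambda-style entry point that reports how much work was left over."""
    items = event.get("items", [])
    done = process_items(items, context)
    return {"processed": len(done), "pending": len(items) - len(done)}


class FakeContext:
    """Local stand-in for the Lambda context object, for trying this outside AWS."""

    def __init__(self, timeout_seconds):
        self._deadline = time.monotonic() + timeout_seconds

    def get_remaining_time_in_millis(self):
        return max(0, int((self._deadline - time.monotonic()) * 1000))
```

Checking the remaining time does not replace raising the timeout; it only makes the failure easier to recognize, because the function returns a "pending" count instead of dying with Sandbox.Timedout.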

Tuesday, April 01, 2025

PowerBI + PostGreSql + AWS-RDS

I was getting an error when trying to connect using the PostgreSQL connector in Power BI:

The remote certificate is invalid according to the validation procedure

The basic solution is informed by this AWS post: Set up SSL/TLS client connections to Amazon RDS for SQL Server and Amazon RDS for Oracle | AWS Database Blog, but it didn't work for me exactly as described there (the PEM file didn't work).

Luckily for us, AWS now provides a PKCS7 (.p7b) file. Go to Using SSL/TLS to encrypt a connection to a DB instance or cluster - Amazon Relational Database Service and download the bundle for your AWS Region. You can use the global bundle instead, but then you will have to click approve in the Import Wizard many times (approximately 3 for every region), so using your region's bundle means far fewer clicks.

Once downloaded, you will have to open your Windows Certificate Manager (Windows >> Run >> CertMgr.msc).

In the CertMgr, click on "Trusted Root Certification Authorities" >> Certificates and then Import.

In the Import Wizard, find the P7B file you downloaded from AWS and import it. Approve the next few steps. You should now be able to connect to PostgreSQL running on AWS RDS.

These steps should also work for connecting SQL Server to PostgreSql in AWS.

Wednesday, December 11, 2024

Microsoft Forms: Difference between a Form and a Quiz

In Microsoft Forms you can create a "Form" or a "Quiz". But what is the difference between the two?

In a quiz, you can specify the correct answer and the points awarded for a correct answer.

In a form, by contrast, you cannot provide the correct answer, nor can you assign the question a score.

Also, with a quiz, you can do some cool things like enter equations for math questions.

A quiz also has some additional settings that allow you to show answers, etc.

Saturday, August 17, 2024

Select Blinds Ac 114 remote pairing/copy code to new remote

Works for AC114-06B and 02B remotes.

Paired remote: the one that currently is programmed and working

New remote: the one you wish to program

Steps:

1. Select the channel on the paired remote that you wish to copy to new remote

2. Select the channel on the new remote

3. Press the stop button once on the paired remote, and then press and hold until the blinds jog up and down once

4. Quickly press the up button on the new remote. The blinds should jog to let you know the code was copied.

Manual: https://drive.google.com/file/d/1mcOqt3M_EZQjjxoddzVfIvbzJvrlqGQ3/view?usp=drivesdk

Monday, April 15, 2024

Windows 11 - Enabling Hibernate

If Hibernate is not available as an option, the following 2 commands run from PowerShell should enable it:

1. powercfg /hibernate on

2. powercfg /h /type full

The above commands need to be run from an Administrator PowerShell window.

In Windows 11, you can then go to: Start >> Type: Control Panel >> Control Panel >> Power Options >> Choose what the power buttons do, and make Hibernate available in the Power menu from there.

Thursday, January 18, 2024

AWS - MWAA - Customize UI Title

If you would like to customize the Airflow UI title to include some additional information, you can do so in MWAA by setting the Airflow configuration option webserver.instance_name.

This will update the title bar to show the value you set.

For more info see: Customizing the UI — Airflow Documentation (apache.org)

Monday, October 23, 2023

AWS Athena - Using Merge + Iceberg tables to store only changed records

This post is based on my github page: mypublicnotes/AWS/Athena/iceberg_tracking_changes.md at master · rajrao/mypublicnotes (github.com)

Change data capture using Athena and iceberg

Many times in a data lake, you have a source that doesn't tell you which records changed. Another use case is an ETL query in which multiple tables and columns take part, where it's traditionally difficult to track which records changed. This page shows one method for tracking those changes and inserting only the records that are new or were updated (at the end, I also show how to track deletes). The method leverages Iceberg tables in Athena (Athena engine version 3) and the upsert mechanism provided by the MERGE INTO statement.

TL;DR; Check out the merge statement used to update only those records that had changes.

Setup: A CTE for source data

I am using a CTE to simulate source data. In practice, you would typically use another Athena table as your source, or a query that brings data together from multiple tables (aka ETL), etc. A key part of this method is a hashing function that can be used to determine when a record has changed. I use xxhash64.

with cte(id, value1, value2) as
    (
    select 1,'a1','b' union all
    select 4,'raj','rao' union all
    select 2,'c2','d2' 
    )
    select *, xxhash64(to_utf8(value1 || value2)) as hash from cte

Note 1: You can use murmur3 instead of xxhash64 using the following code: murmur3(to_utf8(value1 || value2)).

Note 2: Here are the other hashing functions available: https://trino.io/docs/current/functions/binary.html

Setup: Create an iceberg table

The Iceberg table is your final table. It tracks the data that had changes. Here id is the primary key; if your key spans several columns, use all of them in the merge condition.

CREATE TABLE
  test_db.hash_test (
  id int,
  value1 string,
  value2 string,
  hash string,
  last_updated_on timestamp)
  LOCATION 's3://my_test_bucket/hash_test'
  TBLPROPERTIES ( 'table_type' ='ICEBERG')

The Merge statement

Here is a merge statement that inserts new records and updates only when there are changes. The merge statement uses the CTE described above as its source data. You can manipulate the CTE to test various scenarios. The hash column is used to determine when to insert/update data.

MERGE INTO hash_test as tgt
USING (
    with cte(id, value1, value2, value3) as
    (
    select 1,'a1','b',100 union all
    select 4,'rao','raj',200 union all
    select 2,'c2','d2',300 
    )
    select *, xxhash64(to_utf8(concat_ws('::',coalesce(value1,'-'),coalesce(value2,'-'),coalesce(cast(value3 as varchar),'-')))) as hash from cte
) as src
ON tgt.id = src.id
WHEN MATCHED and src.hash <> tgt.hash
    THEN UPDATE SET  
    value1 = src.value1,
    value2 = src.value2,
    hash = src.hash,
    last_updated_on = current_timestamp
WHEN NOT MATCHED 
THEN INSERT (id, value1, value2, hash, last_updated_on)
      VALUES (src.id, src.value1, src.value2, src.hash, current_timestamp)
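To make the three outcomes of the merge easier to see (insert, update on hash change, no-op on equal hash), here is a minimal in-memory sketch in Python. It only imitates the decision logic; the function name merge_changed, the dict standing in for the Iceberg table, and the example hash strings in the note below are hypothetical and do not reflect how Athena executes the statement.

```python
from datetime import datetime, timezone


def merge_changed(target, source, now=None):
    """Upsert source rows into target by id, touching only new or changed rows.

    target: dict mapping id -> row dict (stand-in for the Iceberg table)
    source: list of row dicts with 'id', 'value1', 'value2' and 'hash'
    Returns (inserted_ids, updated_ids).
    """
    now = now or datetime.now(timezone.utc)
    inserted, updated = [], []
    for src in source:
        tgt = target.get(src["id"])
        if tgt is None:
            # WHEN NOT MATCHED THEN INSERT
            target[src["id"]] = {**src, "last_updated_on": now}
            inserted.append(src["id"])
        elif tgt["hash"] != src["hash"]:
            # WHEN MATCHED AND src.hash <> tgt.hash THEN UPDATE
            tgt.update(
                value1=src["value1"],
                value2=src["value2"],
                hash=src["hash"],
                last_updated_on=now,
            )
            updated.append(src["id"])
        # Matched with an equal hash: the row is left untouched,
        # so last_updated_on keeps its old value.
    return inserted, updated
```

Running it twice with the same source rows inserts on the first pass and changes nothing on the second, which is the behavior the hash comparison in the WHEN MATCHED clause is there to provide.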

If you need to deal with deletes, you can add one of the following options (delete or archive) as your first WHEN MATCHED clause:

WHEN MATCHED and src.IsDeleted = 1
  THEN DELETE

WHEN MATCHED and src.IsDeleted = 1
  THEN UPDATE SET  
    is_archived = 1,
    last_updated_on = current_timestamp

Finally, some example queries to view the data:

-- see the history of changes
select * from test_db."hash_test$history" order by made_current_at desc

-- use a snapshot_id from above as your value for xxxxx
select * from test_db.hash_test for version as of xxxxx

-- get only the latest records from the table
select * from test_db.hash_test
where last_updated_on in (select max(last_updated_on) from test_db.hash_test)
order by last_updated_on

Testing Hashing Behavior

When hashing, you need to make sure that null values are handled appropriately.

For example, (null, 'a', null) and ('a', null, null) are different rows and must produce different hashes. If they generate the same hash, you will miss the change. Also, the hash input is built from strings, so columns that are not of type string have to be cast first. For this reason the hash computation gets complicated, and I have not found a simpler solution.

with cte(id,note, value1, value2,value3) as
(
    select 1,null,'a1','b',1 union all
    select 4,null,'raj','rao',2 union all
    select 5,'both null',null,null,null union all
    select 6,'empty & null','',null,null union all
    select 7,'null & empty',null,'',1 union all
    select 8,'empty-empty','','',2 union all
    select 9,'str-null','a',null,3 union all
    select 10,'null-str',null,'a',4 union all
    select 100,null,'c2','d2',5 
)
select *
,concat_ws('::',coalesce(value1,'-'),coalesce(value2,'-'),coalesce(cast(value3 as varchar),'-'))
, murmur3(to_utf8(concat_ws('::',coalesce(value1,'-'),coalesce(value2,'-'),coalesce(cast(value3 as varchar),'-')))) as hash1
, xxhash64(to_utf8(concat_ws('::',coalesce(value1,'-'),coalesce(value2,'-'),coalesce(cast(value3 as varchar),'-')))) as hash2
from cte
order by id
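The null-handling problem can also be reproduced outside Athena. The Python sketch below compares a naive key (nulls simply dropped) with a key that uses a separator plus a placeholder for null in every column, in the spirit of the concat_ws/coalesce pattern in the query above. The helper names are hypothetical, and sha256 is only a stand-in, since xxhash64 and murmur3 are not in the Python standard library.

```python
import hashlib


def naive_key(*values):
    """Concatenate values and silently drop nulls (the problematic approach)."""
    return "".join(str(v) for v in values if v is not None)


def safe_key(*values, sep="::", null_marker="-"):
    """Join values with a separator, replacing each null with a placeholder."""
    return sep.join(null_marker if v is None else str(v) for v in values)


def row_hash(key):
    """Hash the key string; sha256 stands in for xxhash64/murmur3."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


# (null, 'a', null) vs ('a', null, null)
row_a = (None, "a", None)
row_b = ("a", None, None)

collides = row_hash(naive_key(*row_a)) == row_hash(naive_key(*row_b))
distinct = row_hash(safe_key(*row_a)) != row_hash(safe_key(*row_b))
```

Here collides and distinct both evaluate to True: the naive key turns both rows into "a", while the safe key yields "-::a::-" and "a::-::-". One caveat carries over to the SQL version as well: a column whose real value is the placeholder "-" still hashes the same as a null in that position.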

Tuesday, July 18, 2023

PowerBi/PowerQuery: Dealing with errors in Excel files

When you have errors in your Excel file, they sometimes leak through, and adding "Table.ReplaceErrorValues" or "Table.RemoveRowsWithErrors" later in the query doesn't really work. What I have found works is adding the error-fix step right after the navigation step that loads the sheet.

In my case, I added "Table.RemoveRowsWithErrors" right after the Navigation step and it fixed the error.

Sunday, May 28, 2023

Applying for US passport in 2023

We needed a new passport for our daughter, as her passport was due to expire 5 months and 3 weeks after our date of travel (the country requires 6 months of validity).

Panicking, we emailed our senators and representatives. We got a call from one of them, and they advised us to call the passport phone number and tell them that the country we were travelling to requires a visa (an urgent passport appointment is provided 28 days out for countries needing a visa).

Timeline

May 5: figured out we needed new passport. Called passport agency, was told to call back 2 weeks prior to travel.

May 6: emailed senators and representatives

May 8: got a call back from one of the representatives' staff, advising us to call the passport agency again and tell them that we needed a visa

May 9: called passport agency and got an interview date for May 23 at the Colorado office in Aurora. Lucky for us, this is a 30-minute drive.

May 23: appointment was for 8am. Should have lined up 30 minutes early. Line was long, but efficiently managed. Had flight tickets, birth certificate (as the passport was for a kid, this is considered a new application and not a renewal), and paperwork about needing a visa. The entire appointment lasted less than 60 minutes. Was told to return after 2pm on the 25th to pick up the passport.

May 25: got passport (took 15 minutes)

Reflections:

1. Everyone we spoke to, from the phone staff to the people in the Colorado passport office, was extremely helpful, efficient and great to work with.

2. We didn't really need intervention from the senators/representatives, but the tip they gave us about the visa provision was the breakthrough we needed.

3. Next time we will apply for a passport 12 months prior to expiration, as many countries need 6 months of validity on the passport for travel.

4. Kids need a DS-11, which means you are applying for a new passport each time rather than renewing (I believe until the age of 16). Their passports are valid for only 5 years.