Ask HN: How to automate collecting HAR file while user is browsing

25 points by royalghost 4 months ago

Hello,

We are facing an intermittent issue in our web application where, for some users, HTTP requests end in errors (400s), especially during token refresh with the authentication server.

Normally we would ask the user to generate a HAR (HTTP Archive) file and inspect it to find the root cause. However, it is currently challenging to collect the HAR file manually because the error is not consistent. Sometimes it seems to go away, but then it suddenly reappears, causing a bad user experience.

It is also hard to add logs because the token refresh happens on the client side, in the browser, so technically there are no traces of it on the server side.

I am looking into ways to automate generating the HAR file, but it does not seem straightforward.

If any of you have faced a similar issue in the past and found a way to add such error logging in a web service, let me know. Any other thoughts and suggestions are highly appreciated.

Thank you in advance.

14 comments

lolinder 4 months ago

This isn't a direct answer to your question, but be very careful with asking for HAR files. They're super convenient, but if your tech support doesn't understand that HAR files are the worst kind of PII, you can get in big trouble.

I've seen HAR files containing Google account session tokens attached in plain text to Jira tickets. If you end up leaking those tokens, your customers will not be amused.

See the Okta breach: https://www.rezonate.io/blog/har-files-attack-okta-customers/

smittywerben 4 months ago

What was the body of the HTTP 400? You should log that. Maybe there's a refresh token grace period, depending on the implementation.

I'd sooner be testing in a lab environment, recording a pcap file on both sides to try to get the client's TLS session to break, before I'd want a client's confidential credential flow sent to me. I don't like to bother people. I've always hated refresh tokens, at least OAuth's design of them. Is sending a client's decrypted MITM logs around really safer?

alp1n3_eth 4 months ago

How intermittent of an issue is it? I don't think collecting client-side HAR files from real customers is the way to go, even if they're willing. What happens when the next weird error shows up? More HAR files?

Echoing some other suggestions, but to a different extent: increase logging in the problem areas, both client-side and server-side. It might be directly related to the token refresh since it only happens there, so a great place to start is within that functionality. Log the entire connection's info on both sides (front-end and back-end logging), and if users are manually submitting tickets you should be able to track them down by userID / IP in the logs.

Also extend the fuzzing capabilities of your tests with browser automation (potentially headless, depending on the issue) that authenticates and uses the app "normally". Keep it using the app on repeat, and when token refresh time comes, see if the error pops up. Throw some extra variables in there: ensure it's off the corporate network, or routed through DCs farther away, to see if it's a latency issue somewhere else. You could log the HAR file for this.

Multiple versions of the tests might need to be run in parallel with different modifiers, such as one being allowed to communicate directly with the origin vs. another going through the CDN like a standard customer would.

This is also an edge case, but I've seen it pop up sometimes: ensure that there aren't any other required variables missing during the refresh process. Sometimes specific functionality in some apps is tied to a custom header, and sometimes the value isn't updated to what the app expects. Things like that could throw the process off from another angle.

solardev 4 months ago

HAR files are big and it seems like overkill to send them every time. Can't you just make a client-side fetch to an error reporting service? I.e., if the app detects a 400, it sends a (no auth required) payload of the failed request and response, with secrets sanitized, to a separate error reporting endpoint.
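
A minimal sketch of the "secrets sanitized" part, assuming the app already has the failed request and response as plain objects. The header and field names treated as sensitive, and the helper names `redactHeaders`, `redactBody`, and `buildErrorReport`, are illustrative assumptions rather than anything from the thread; a real app would need its own list.

```javascript
// Header names treated as secret (an assumption; extend for your own app).
const SENSITIVE_HEADERS = new Set(["authorization", "cookie", "set-cookie", "x-api-key"]);
// Body fields treated as secret (hypothetical list, typical of token requests).
const SENSITIVE_FIELDS = new Set([
  "access_token", "refresh_token", "id_token", "client_secret", "password",
]);

// Replace the value of every sensitive header, matching names case-insensitively.
function redactHeaders(headers = {}) {
  const out = {};
  for (const [name, value] of Object.entries(headers)) {
    out[name] = SENSITIVE_HEADERS.has(name.toLowerCase()) ? "[REDACTED]" : value;
  }
  return out;
}

// Walk a parsed JSON body and replace the value of every sensitive field.
function redactBody(value) {
  if (Array.isArray(value)) return value.map(redactBody);
  if (value !== null && typeof value === "object") {
    const out = {};
    for (const [key, inner] of Object.entries(value)) {
      out[key] = SENSITIVE_FIELDS.has(key.toLowerCase()) ? "[REDACTED]" : redactBody(inner);
    }
    return out;
  }
  return value;
}

// Build the payload to send to the error reporting endpoint.
// Note: the URL is passed through untouched, so query-string secrets
// would still need separate handling.
function buildErrorReport(request, response) {
  return {
    timestamp: new Date().toISOString(),
    request: {
      method: request.method,
      url: request.url,
      headers: redactHeaders(request.headers),
      body: redactBody(request.body),
    },
    response: {
      status: response.status,
      headers: redactHeaders(response.headers),
      body: redactBody(response.body),
    },
  };
}
```

The resulting object can be serialized with `JSON.stringify` and posted to whatever reporting endpoint the service exposes. An allow-list of known-safe fields would be the stricter design; the deny-list here is only the shortest way to show the idea.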

davidt84 4 months ago

As that's pretty much spying on the user, I don't think browsers make it easy to do that.

geocar 4 months ago

Is this a CSP thing? Can you get away with https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Reporting-Endpoints and window.onerror?

Also, do you actually need the HAR file, or just a log of your servers' inputs/outputs from the clients' perspective? You can get that The Boring Way if you don't have a CSP issue, so maybe solve that issue?

dewey 4 months ago

Might be overkill for something like this but tools like Sentry could also help you track it down more easily without any action by the customer.

phrotoma 4 months ago

I think fullstory.com does this or something very like it. Not affiliated, just friends with some folks who work there.

viraptor 4 months ago

> the token refresh happens on the client side from the browser

You can totally add logging for that. If you don't have an existing service that can handle it, you can create a logging-only endpoint for that purpose and send the event asynchronously so it doesn't block other work.
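
One way this could look on the client, as a sketch only: a wrapper around a fetch-like function that reports non-2xx responses in the background. `withFailureLogging` and the `report` callback are hypothetical names; the only platform behavior relied on is that `fetch` resolves (rather than rejects) on a 400 and that `Response.clone()` lets the body be read twice.

```javascript
// Wrap a fetch-like function so failed responses are reported without
// delaying the caller. `fetchImpl` and `report` are injected so the
// wrapper can also be exercised outside a browser.
function withFailureLogging(fetchImpl, report) {
  return async function loggedFetch(url, options = {}) {
    const startedAt = Date.now();
    const response = await fetchImpl(url, options);
    if (!response.ok) {
      const event = {
        url: String(url),
        method: options.method || "GET",
        status: response.status,
        durationMs: Date.now() - startedAt,
      };
      // Read a copy of the body in the background; the caller keeps the
      // original response. Reporting errors are swallowed on purpose so
      // logging can never break the app.
      response
        .clone()
        .text()
        .then((body) => report({ ...event, responseBody: body.slice(0, 2000) }))
        .catch(() => {});
    }
    return response;
  };
}

// In a browser this might be wired up as follows ("/client-log" is a
// made-up endpoint):
//   const loggedFetch = withFailureLogging(fetch.bind(window), (event) =>
//     navigator.sendBeacon("/client-log", JSON.stringify(event)));
```

The response body of a failed token refresh can itself contain sensitive data, so it would be worth sanitizing the event before it leaves the browser.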

Zanfa 4 months ago

I don’t remember how we debugged it at the time, but I’ve run into very similar symptoms that were caused by clock skew between client & servers. Increasing the validity window to both past & future by a longer period helped resolve it.
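
To check whether clock skew is in play, something like the sketch below could be logged alongside each failed refresh. It assumes the tokens carry JWT-style `nbf` and `exp` claims in seconds since the epoch, which the thread does not confirm, and it estimates skew from the HTTP `Date` response header. The function names are made up for illustration.

```javascript
// Estimate how far the client clock is from the server clock using the
// HTTP `Date` response header. Positive means the client is ahead.
// The header has one-second resolution and the estimate ignores network
// latency, so treat small values as noise.
function estimateClockSkewMs(serverDateHeader, clientNowMs) {
  const serverMs = Date.parse(serverDateHeader);
  if (Number.isNaN(serverMs)) return null; // header missing or unparseable
  return clientNowMs - serverMs;
}

// Check JWT-style `nbf` / `exp` claims (seconds since epoch) against a
// clock reading, widening the window by `leewaySec` in both directions.
function checkTokenWindow(claims, nowSec, leewaySec = 0) {
  if (claims.nbf !== undefined && nowSec + leewaySec < claims.nbf) {
    return "not-yet-valid";
  }
  if (claims.exp !== undefined && nowSec - leewaySec >= claims.exp) {
    return "expired";
  }
  return "valid";
}
```

For cross-origin requests, scripts can only read the `Date` header if the server lists it in `Access-Control-Expose-Headers`. The `leewaySec` parameter corresponds to the wider validity window described above: a token that fails the check with zero leeway but passes with, say, 60 seconds points towards skew rather than a genuinely stale token.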

sim7c00 4 months ago

commendable that you wanna go this way honestly. i see a lot of companies just push bullshit back onto users in the face of this type of intermittent client side issue. repeating same dumb questions until you give up.

as some other commenter said, automating har files might not be ideal as it could collect much too much info, and browsers will make this very difficult to automate.

perhaps you can add client side logging and automate gathering that, or ask users for that rather than a har file. like "if xyz happens again please send us the log from location yzw". not sure if that is possible, but it would at least unburden users from running devtools on an intermittent issue. if it happens only to a few users you can add it optionally to their client side like a debug/trace mode. if it happens widespread i'd say add it for all users.

good luck and happy to see ur not giving up just yet :D these issues can be quite frustrating to get good data on. keep at it and ull find it eventually.

it might also be possible to automate a client at your own side and run it until it hits the issue. no guarantee it will actually hit it though. you can run it from office, home, and try to have many colleagues / people run it in different (maybe personal) setups.
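
A rough sketch of such a debug/trace mode: a bounded in-memory log that keeps only the most recent entries and can be dumped to text for the user to attach to a ticket. `createDebugLog` and its methods are hypothetical names, and where the dump ends up (download, clipboard, upload) is left open.

```javascript
// Bounded in-memory log: keeps only the most recent `capacity` entries,
// so it can stay enabled without growing forever.
function createDebugLog(capacity = 200) {
  const entries = [];
  return {
    // Record one event with optional structured details.
    add(message, details = {}) {
      entries.push({ time: new Date().toISOString(), message, details });
      if (entries.length > capacity) entries.shift(); // drop the oldest
    },
    // Number of entries currently held.
    size() {
      return entries.length;
    },
    // Serialize the buffer so the user can send it with a support ticket.
    dump() {
      return JSON.stringify(entries, null, 2);
    },
  };
}
```

Anything written to this log should already be stripped of tokens and cookies, for the same reasons raised about HAR files elsewhere in the thread.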

new_user_final 4 months ago

I haven't used it, but you can try it and see if it works for you. It has custom dev tools: https://eruda.liriliri.io/

mariogintili 4 months ago

can't you just do window.onerror = aFunctionThatReports400ErrorsWithAllTheDataYouNeed;

moltar 4 months ago

Have you tried Sentry with replay?