I want a notebook where causality can only flow forward through the cells. I hate notebook time-loops where a variable from a deleted cell can still be in scope.<p>1. Checkpoint the interpreter state after every cell execution.<p>2. If I edit a cell, roll back to the previous checkpoint and let execution follow from there.<p>I can't tell you how many times I've seen accidental persistence of dead state waste hours of people's time.
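A minimal sketch of that checkpoint/rollback scheme, assuming the dill library and hypothetical cell hooks (a real kernel would also have to handle unpicklable state such as open files and sockets):<p><pre><code>import dill  # can serialize most of an interpreter session

def checkpoint(cell_index):
    # Snapshot the whole __main__ namespace after a cell runs.
    dill.dump_session(f"checkpoint_{cell_index}.pkl")

def rollback(cell_index):
    # Restore the snapshot taken just before the edited cell,
    # so execution can replay forward from there.
    dill.load_session(f"checkpoint_{cell_index}.pkl")
</code></pre>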
I used Mathematica’s notebook interface quite heavily 15-20 years ago; Jupyter’s interface is a clone of that in many ways.<p>At the time, my workflow was to use two different notebooks for everything: foo.nb and foo-scratch.nb. I’d get things working a piece at a time in foo-scratch.nb, not caring at all how it looked, not having to worry about leaving extra output or dead ends of explorations lying around; then the refined cells would be copied over to foo.nb, which would get pristine presentation, and which I could run top-to-bottom.<p>This workflow worked pretty well for me: very clean reproducible output, with the ability to easily refer back to all the steps of how I’d derived something, along with copious detailed private notes.<p>I never had to use it but I’m pretty sure each cell even had its modification time stored in the metadata in case I wanted to view a chronological history.
See also Joel Grus' talk, "I Don't Like Notebooks": <a href="https://www.youtube.com/watch?v=7jiPeIFXb6U" rel="nofollow">https://www.youtube.com/watch?v=7jiPeIFXb6U</a><p>slides: <a href="https://docs.google.com/presentation/d/1n2RlMdmv1p25Xy5thJUhkKGvjtV-dkAIsUXP-AL4ffI/edit" rel="nofollow">https://docs.google.com/presentation/d/1n2RlMdmv1p25Xy5thJUh...</a>
As someone who has been programming for a very long time and has been using notebooks for a reasonably long time (and almost always starts projects with them), my feeling is that they are a bit like C in that they make it easy to accidentally shoot yourself in the foot if you aren't careful. I always strive to end up with a notebook that can be "Run All" from a fresh clone, and I'd say that I'm successful with that maybe 60-70% of the time, and am close enough that I can fix it in the remainder.<p>As the article (and the many others like it that have frequently cropped up ever since IPython Notebooks first started ramping up in popularity) points out though, a lot of newer users don't have the discipline to ensure that they're not jumping around too much. It's not a problem for them in the immediate term since they know how the state ought to work, but then it becomes a mess when they try to share it with someone else (or to run it themselves again 3 months later).<p>The challenge though is that the data analysis workflows notebooks allow are unbeatable by any other tools I've tried. In the end, it may just be that they're the worst form of data programming except for all of the others that have been tried.
I interned @ Google AI last summer; used notebooks nearly every day. Estimated productivity gain is 3-5x.<p>Biggest tip I have is to turn autoreload on, then write the bulk of your code as modular functions and call those functions within your notebooks. This keeps the notebook tidy, and it's easier to push your code this way.<p>It's also easier for sharing, since most people viewing your notebooks (mentors, people outside your team) are interested in results/artifacts such as metrics, generated text, images, and audio, which notebooks display well (not your code).
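A sketch of that setup (the my_project package and its functions are hypothetical); autoreload is a standard IPython extension:<p><pre><code># First notebook cell: re-import edited modules automatically before
# each execution, so edits to the .py files take effect immediately.
%load_ext autoreload
%autoreload 2

# Keep the heavy lifting in modules; the notebook just calls and displays.
from my_project.features import build_features
from my_project.plots import plot_metrics

df = build_features("data/train.csv")
plot_metrics(df)
</code></pre>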
(A frequent Jupyter Notebook user here, for data exploration and for teaching deep learning, where Colab is indispensable.)<p>The main question is: what are the alternatives for data exploration (and sharing its results)? Similarly, for data science tool demos, notebooks shine.<p>IMHO the problem is not in the notebooks, but in how they are being used (i.e. the workflow). By writing scripts in py files, and using notebooks only to show their results (processed data, charts, etc), we get the best of both worlds.<p>The only built-in problem with Jupyter Notebooks is JSON mixing input and output (and making it a pain to work with version control). But here RMarkdown (and a few other alternatives) works well.
I don't get why anyone who knows how to use an IDE would ever use a notebook; the coding experience is garbage in comparison. I understand they started as a way to get STEM kids coding quickly, but now they are like a standard in data analysis and data science, with those people needing experienced devs to translate the notebook into production code. This just drives the silo walls up higher.
I don't know what this document is meant to do, but you will have to take my JupyterLab instance from my cold, dead hands.<p>I love notebooks. I work fast: line by line I execute commands and immediately see the output (dataframes or graphs). For complex code I have an editor open (in JupyterLab or VS Code) for some functions and classes, but the main developing is done in the notebook; anything that ends up in a module starts in my notebooks.<p>As a biologist who learned to program after 30, I just don't understand how you can develop data processing code without such a close handle on dataframes and without checking in graphs/visualizations that your code does what you expect. I don't see how I would do that in pure VS Code or other IDEs.<p>I also don't understand this sentence: "Once the data is loaded, it then has to be cleaned, which participants complained is a repetitive and time consuming task that involves copying and pasting code from their personal "library" of commonly used functions." What is the alternative? Not cleaning the data? And why copy and paste when you can perfectly well have your own shareable module on the side? I guess most notebook users do some kind of hybrid development.
The FastAI people have been working on a lot of these issues with their NBDev too: <a href="https://www.fast.ai/2019/12/02/nbdev/" rel="nofollow">https://www.fast.ai/2019/12/02/nbdev/</a>
I love jupyter notebooks. Without them, I wouldn’t have been half as productive as I was during my PhD.<p>Here’s a post I wrote just a few weeks ago describing some of the conventions that I established for myself over the course of 5 years:<p><a href="https://jessimekirk.com/blog/notebook_rules/" rel="nofollow">https://jessimekirk.com/blog/notebook_rules/</a><p>I suspect that a lot of the conventions I describe help mitigate problems described here, some of which should be strictly or optionally enforced by the notebook instead of the user.<p>(The site’s very much a work in progress, so expect to see odd and broken things if you go poking around.)
No mention of <a href="https://observablehq.com" rel="nofollow">https://observablehq.com</a> notebooks? They’re the best I’ve found in the “Share and collaborate” and “As products” category. JupyterLab is still pretty great for exploratory stuff, but visualization possibilities in observable are incredible.
The problem of notebooks has been solved by the Python extension in Visual Studio Code (and some other editors too, although VS Code is the one I'm most familiar with).<p>Editing an ordinary Python file, if you insert the comment "# %%", you turn everything between that comment and the next "# %%" (or the end of the file) into a code cell that can be submitted to the ipython kernel, just as in a Jupyter notebook. The editor splits into two halves, the left half your Python file and the right half the Jupyter notebook window with submitted code and formatted output (e.g., DataFrames look pretty, plots display normally, etc.). When you're done running everything, you can export the result as a Jupyter notebook. Because you're editing an ordinary Python file, standard features like version control and importing the file you're editing into other files (you cannot normally import .ipynb files IIRC) work normally.<p>And of course since VS Code is a real editor/IDE, you can double click a file and have it open right up (no resorting to a Terminal to start your Jupyter session) and you get syntax themes, a built in Terminal, a git UI, code snippets, documentation on hover, vim mode if that's your thing, etc.<p>The only downside I've found is that the Python extension doesn't incorporate ipython's autocomplete in its own autocompletion, but that's a small price to pay for getting to treat .py files as notebooks.
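For illustration, a sketch of what such a file looks like (the file and column names are hypothetical); Spyder, mentioned further down in this thread, uses the same convention:<p><pre><code># analysis.py - an ordinary Python file, but the editor treats
# each "# %%" section as a cell it can send to the IPython kernel.

# %%
import pandas as pd
df = pd.read_csv("data.csv")

# %%
df.describe()  # runs as its own cell; the interactive window renders it

# %%
df.plot(x="date", y="value")
</code></pre>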
So literally all of these complaints are about particular implementations of notebooks, not the concept of computational notebooks in general - or are all computational notebooks destined to have unstable kernels?<p>In my mind, notebooks <i>should</i> be married to a functional style of programming, where you use the notebook's markup to thoroughly explain and document your functions. Below your "function definition" section, you keep a "trying things out" section where you actually plug the data into your functions for debugging/visualizations. You can't shoot yourself in the foot with variables, because all the work is done in your function's lexical scope. You can shoot yourself in the foot with stale function definitions, but a good notebook interface gives you the ability to clear function definitions and run groups of cells, so you can make sure you always run your functions in a group that starts with a "clear function definitions" cell.<p>When you are done, you just cut the "trying things out" section into a second notebook which references the functions in the first and <i>voilà</i>, you've got a very well documented library of functions, and a new work notebook where you can freely polish your visualizations/whatever.
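A toy sketch of that two-section layout:<p><pre><code># --- Function definitions: documented, pure, no hidden state ---
def normalize(xs):
    """Scale a list of numbers into [0, 1]."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

# --- Trying things out: scratch cells, safe to clear or delete ---
normalize([3, 1, 4, 1, 5])
</code></pre>
When the exploration is done, the top section is what you keep as the documented library.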
This is a solid list. It would be even better if juxtaposed with current efforts to solve each of these problems - every DS I know is addressing at least 2-3 of these with some pet tools in their own environment. For example, we use Panel and Holoviews to make data exploration much easier. I have a feeling the ecosystem would improve faster if we had an index of (partial) solutions aligned with this problem set.<p>One category left out of the list: testing of data pipelines (cf. Great Expectations).
I use Jupyter Lab with Python every day. It's where I do my initial data exploration and cleaning. Jupyter Lab is not perfect, but most of these findings seem like they are more issues of inexperience with technology and programming, not computational notebooks.
I have been heads down in jupyter for the past couple of weeks and I finally realized I just <i>DO NOT LIKE IT AT ALL</i>! Cracks started appearing and then suddenly there was an avalanche of disappointment.<p>The first crack -- it's almost impossible to build a nice presentation in Jupyter, because you always have to show your code and its stderr. I imported all the TeX goodness, and it looked pretty nice, but I couldn't show the output without showing the TeX code. Importing the TeX interpreter is quite non-standard and means that my notebook doesn't play well with the public servers. I also got burned by some kind of permissions issue, so that all my charts ended up being invisible to read-only users.<p>The second crack -- I can only look at the code from within my own jupyter server. The source is buried in a very noisy json format.<p>The third crack -- Who wants to write code in the impoverished browser based editor provided? How many times have I deleted a closing brace that was automatically inserted incorrectly? How can I do a global search and replace?<p>The fourth crack -- I can't test my code unless I include all the tests in the notebook!<p>I'm complaining. I realize that I don't have anything constructive to offer, and I'm really a beginner. However, I think some of my disappointment is justified, as I think it was reasonable to assume that I could build my notebooks to be next level presentations.
I think Atom’s Hydrogen and VS Code’s Python extension are best-in-class Jupyter clients that achieve everything JupyterLab set out to do, with more and better features. I develop scripts that run top to bottom, with a notebook side-by-side that executes code blocks from my script on a single keystroke.
I think Computational Notebooks are a great idea, and yet I have the feeling that we are in the process of seeing them overapplied. They are wonderful for certain situations, and teaching or demonstrating code to others is right in its sweet spot.<p>I get the impression that people are creeping in the direction of trying to do everything with one tool, which sounds like it would end up in the same swamp that Eclipse went into. Sometimes, you need to use different tools for different tasks, and not everything should integrate. Just my opinion.
I think that all the pain points of the article are a result of not using notebooks for their purpose. In my opinion, notebooks are good for:<p>1. POC/MVP: Showing that what you want to do will work before making a full structure.
2. Creating PDF/HTML documents with code and output.
3. Exploratory data analysis and visualization.<p>I think many of the data scientists in the article go well beyond what a notebook is for. A notebook is where you start, but it should never be a production tool.
Jupyter notebooks are great for many purposes. They have, however, two really tragic shortcomings:<p>1. They are stored by default in stupid json files instead of plain source code with comments.<p>2. The text editing interface inside the browser is horrific and very difficult to normalize (e.g., disable "smart" closing of parentheses, disable the capture of classic unix copy-pasting, etc).
This is a great list, and totally matches my experience. I also agree this is solvable with tooling.<p>A) VS Code / IDE needs to be the primary editor
B) Results are not stored with source
C) Export (build) allows packaging for whatever platform.<p>Python notebooks especially also use some crazy mutable APIs. In general notebooks align with other code written by people who aren’t usually software engineers building production systems. They’re much more about getting things done, APIs and tools are less questioned, a lot of pain is swallowed because PhDs have plenty of time to write a few lines of code. I don’t want to sound disparaging towards these people, it’s just a different set of tradeoffs from writing production grade software.
As a computer scientist/software engineer, please allow me the question: why would I prefer a notebook over, e.g., equivalent python script(s) in a git repo?<p>I first saw jupyter notebooks when my sister (physicist, non-programmer) used one for analyzing economic data with pandas. Run-time for the full data set was half a day (and IMHO, for that analysis, SQL would have been better suited). I understand that as a non-programmer it looks alluring, but once the language proficiency is built up, why not use an IDE and run the code in a shell?
<a href="https://datalore.io" rel="nofollow">https://datalore.io</a> has
(1) a reactive Datalore kernel that solves the reproducibility problem. It recalculates the code automatically when something is changed, and recalculates only the changed and the dependent code;
(2) good completion;
(3) online collaboration;
(4) read-only sharing;
(5) publishing;
(6) sensitive data can be saved in a .private directory that is not exposed when the notebook is shared with read-only access.
There are these and other problems with CNs:<p>0. They try to be "be-all, end-all" proprietary container documents, so they lack generality, compatibility and embeddability. It would be better if live code try-out snippets were self-contained and embeddable in other documents: HTML, other software, maybe PDF, LaTeX or literate programming formats. Maybe there should be standard, versioned interpreters for each kind of programming language in WebAssembly, cached for offline usage by the browser, for inclusion in documentation, papers, etc.?<p>1. For prototyping, it is better to have try-out live code (and/or REPLs with undo), like what Xcode/iOS Playgrounds are for Swift or what ReInteract was for Python.<p>2. The computational notebook software that I've seen is terrible: complex, fragile and messy to install. It makes TeXLive look effortless by comparison.<p>3. Beyond replicability, what goal(s) are CNs really trying to solve?<p>3.0. For replicability itself, why not have a GitLab/BitBucket/GitHub repo for code and a Docker/Vagrant container one-liner that grabs the latest source when built? Without a clear, consistent and simple build process, there is no replicability, only wasted time, headaches and fragile/messy results.<p>3.1. Are CNs "hammers" for "nails" that don't exist?
Good list.<p>Their observations bring to mind the benefits of watching people program on YouTube or video, where you learn a style of working you may not even have considered.<p>However there is one other issue that is not on the list: because a notebook is meant to be read or shared, I always feel like my work is public, and feel less inclined to play around and just take a look at things. When I do “transfer” my work to a notebook, it's only the surprising or interesting things, which suppresses the discovery process.
One thing that I find to be incredibly useful is the keyboard shortcut `00` (press zero twice while focus is outside of a cell), which will restart the kernel, clear all output and re-run the whole notebook.<p>This way I'm sure that "library code" that I'm editing in parallel in a real text editor is up to date in the notebook, and it also limits the confusion due to out-of-order execution problems.<p>The overall workflow is something like this:<p><pre><code> 1. explore using thing.<TAB>, thing?, and %psource thing
2. edit draft code chunk or function
3. when chunk 80% done; move it to a module
and replace it with an import statement
4. press 00 to re-run everything, then GOTO step 1
</code></pre>
The key to preserving sanity is step 3: as soon as the exploration phase is done, move to a real text editor (and start adding tests). Don't try to do big chunks of software development in the notebook. You wouldn't write an entire program in the REPL, would you?<p>Sometimes I keep the notebook around as a record of failed explorations or as a "test harness" for the code, but most of the time it's throwaway, since all the useful bits have moved into a normal python module/script under version control.
The reality is that notebooks are, and need to be developed as, an app platform ...<p>In order to do notebooks properly, you need:<p>1. Discovery (ideally static discovery) of all the state the notebook needs, and the bulk of state the notebook will/could manipulate during its execution. Your container needs to intercept the filesystem and networking APIs that will be invoked, so that the state resulting from these operations can be observed by the runtime and shimmed appropriately, for reproducibility and for performance optimization.<p>2. The notebook (and the runtime-inferred model of all the required inputs) needs to be repo-stable: I should be able to write a notebook app that reads from the file system on my development host, deploy it somewhere, and the runtime should take care that however that post-deployment file system read is implemented, it matches my local development semantics.<p>3. A platform-level dependency graph needs to exist to model re-execution requirements automatically, incorporating code changes and external state.<p>Apple could build this, and "notebook-os" would be the correct conceptual framework for it ... anything less is always going to leave us severely wanting.
My main use for notebooks is a simple way to constantly hold a whole large dataset in memory. That way if I want to try some feature reduction or remove some bad result, I can just do that and not wait 10 minutes for my slow PC to rerun my import code. I feel like an easy way to do that in base python would draw me away from notebooks.
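One rough stand-in outside notebooks is a disk cache (a sketch; the loading routine below is a stand-in for real import code). It is not as instant as a live kernel, but it avoids re-running the slow import on every change:<p><pre><code>import os
import pickle

CACHE = "dataset.pkl"

def slow_import():
    # Stands in for the 10-minute data loading code.
    return {"features": list(range(10**6))}

def load_dataset():
    # Load from the cache if present; otherwise build once and save.
    if os.path.exists(CACHE):
        with open(CACHE, "rb") as f:
            return pickle.load(f)
    data = slow_import()
    with open(CACHE, "wb") as f:
        pickle.dump(data, f)
    return data
</code></pre>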
Obviously it's pretty hard to make general criticism of the notebook GUI, especially without comparing it to a specific other user interface for data scientists, such as a traditional REPL terminal or some other command line tool.<p>The Python world gives a good example of the sheer complexity of a notebook infrastructure. There is IPython, there is Jupyter Notebook, there is JupyterLab. There is even stuff like SageMathCloud (nowadays called CoCalc), which is basically a web GUI to a VPS combining command lines and various notebooks. And hell, most of these web based interfaces try to make sharing easy.<p>Maybe we should start comparing these (mostly OSS) tools to the traditional notebook GUIs of Matlab and Mathematica, the things we used in the 90s and 2000s. My feeling is that they were more robust and could handle large data better, but they lack all the tooling we get for free on the web.
I would love to see some gifted people using this info to further improve tools like nteract[0].<p>It already eases a lot of the pain you may run into when setting up a Jupyter notebook without a software developer's knowledge.<p>[0]: <a href="https://nteract.io/" rel="nofollow">https://nteract.io/</a>
I'm not sure I understand the issue about the user repeatedly tweaking parameters for their data visualization. If anything, that is a reason notebooks are so nice. The repeated tweaks aren't due to the notebook format; they happen because tweaking is an inherent part of the data visualization process, where it's hard to predict how the end result of a particular parameter choice will look with a given data set. So the same process would occur whether one was using a notebook or a script, but with a script it becomes much more cumbersome to actually see the result. In a notebook, the parameter tweaking for a data visualization is immediately followed by the result.<p>I definitely agree with most of the other points though.
I think it's easy to do notebooks wrong, but possible to do them right. I try and do quick prototyping in notebook cells before moving it off to a separate .py file, and avoid keeping any code that does anything other than visualisation or parameter setting inside a cell long-term. That way, if you need to run something "in production" (whatever that means in your context), you don't end up having to pick apart and re-write your code -- you just import the .py file you wrote along the way.<p>For me, notebooks are a super handy way of visualising and sharing results during meetings, and it's difficult to imagine a more convenient alternative.
I tried to encourage our team to use notebooks, but everyone prefers using PyCharm and git for sharing code. We don't have much visualization, which might be the reason, but I was surprised how many people just hated it.
Direct link to the preprint: <a href="http://web.eecs.utk.edu/~azh/pubs/Chattopadhyay2020CHI_NotebookPainpoints.pdf" rel="nofollow">http://web.eecs.utk.edu/~azh/pubs/Chattopadhyay2020CHI_Noteb...</a><p>Interesting study; I like the mixed-method approach. A quick glance at the industry of the participants suggests that there might be a bias towards structured data (which I think is actually acceptable, as that makes up a huge chunk of the non-academic ML-notebook work).
Edit: The authors acknowledge this in the "Limitations".
This is akin to reviewing how well a screwdriver drives nails. Yes, it has problems. That doesn't mean it's a bad tool - you're just not using it right. Does it require discipline? Yes, but so does the screwdriver.
That being said, I think jupyter specifically has some legacy issues around format, and I prefer R markdown. As much as I love pycharm, it's never going to do more than replicate the notebook experience.
IMHO, given that the main author publishes on code UI/UX, the title seems more like clickbait. Not sure why it's so upvoted.
I quite like the Spyder approach: Pure python code that is segmented into cells by inserting a special comment line.<p>The cells can then be individually executed in an ipython shell, or the entire script can be run with the regular python interpreter. This makes it easy to tweak the individual parts without having to re-run everything. In contrast to jupyter notebooks you still end up with a valid python script that can be easily version controlled.<p>I just wish that I could use vim instead of the Spyder IDE.
I wrote a plugin for ipython that some people might find useful: <a href="https://github.com/uiuc-sine/ipython-cells" rel="nofollow">https://github.com/uiuc-sine/ipython-cells</a><p>It lets you do linear execution of blocks like in Jupyter, but in a normal .py file. Obviously more lightweight than Jupyter and you get to use your regular editor.
I work at <a href="https://www.deepnote.com/" rel="nofollow">https://www.deepnote.com/</a>, we are trying to tackle some of the pains mentioned in the article (setup, collaboration, IDE features like auto-complete or linting).<p>We are still early access, but if you are interested in an invite just let me know. My email is filip at deepnote dot com.
I'm curious: how do people log their experiments? When I started, I used to just keep the cells, but that led to very long and impossible-to-parse Jupyter notebooks. I have since opted for keeping a journal.txt file in Atom where I write down hyperparameter configurations, epochs run, and results (for ML).
But that feels a bit awkward as well.
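One lighter-weight alternative to a hand-kept journal.txt, sketched below: append one JSON line per run and grep or parse it later (the field names are just an example):<p><pre><code>import json
import time

def log_run(path, config, metrics):
    # Append one machine-readable line per experiment.
    entry = {"time": time.strftime("%Y-%m-%d %H:%M:%S"),
             "config": config, "metrics": metrics}
    with open(path, "a") as f:
        f.write(json.dumps(entry) + "\n")

log_run("journal.jsonl",
        config={"lr": 3e-4, "epochs": 20},
        metrics={"val_acc": 0.91})
</code></pre>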
At Gigantum, we're trying to solve some of these issues too. A Gigantum Project lets you run Jupyter or RStudio in a container that is managed for you. Everything is automatically versioned so you can sort out exactly what was run, by who, and when.<p><a href="https://gigantum.com" rel="nofollow">https://gigantum.com</a>
One idea for a pain point not mentioned: better variable persistence. If I declare a variable, then delete the cell I declared it in, the variable persists. I've had this cause issues because if I use the deleted variable by accident, it will work fine right up until a kernel restart.
For emacs org-mode users this <a href="https://github.com/dzop/emacs-jupyter/blob/master/README.org" rel="nofollow">https://github.com/dzop/emacs-jupyter/blob/master/README.org</a> is worth looking into.
I’m surprised no one has mentioned what I see as the biggest failing of notebooks: poor handling of connection loss / reconnection. The kernel will continue to run, but a connection hiccup will often make the notebook UI stop updating (and lose any kernel output).
Notebooks are bad and unreliable. You are repeating your code all the time, and you are limited to working with smaller datasets. If you are into visual data analysis, use Orange or other similar data mining tools.
We allow usage of notebooks only for presentation purposes.
It is interesting to me how this talks about "computational notebooks" but it seems to be about Jupyter and derivatives thereof -- RMarkdown notebooks run inside of the RStudio IDE, and they don't use the term 'kernels' like Jupyter does.
What are the currently available CI options for notebooks? You'd think this would be one of the first tools people would need to make sure notebooks are reproducible, but there seems to be little sign of CI usage.
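One minimal option, sketched here under the assumption that pytest runs in CI: execute every notebook top-to-bottom with nbconvert and fail the build on any error.<p><pre><code># test_notebooks.py - run under pytest in CI
import pathlib
import subprocess

def test_notebooks_run_clean(tmp_path):
    for nb in pathlib.Path("notebooks").glob("*.ipynb"):
        # --execute re-runs all cells; an exception in any cell yields a
        # non-zero exit code, which check=True turns into a test failure.
        subprocess.run(
            ["jupyter", "nbconvert", "--to", "notebook", "--execute",
             "--output-dir", str(tmp_path), str(nb)],
            check=True,
        )
</code></pre>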
Streamlit is imo the best alternative. I was a beta tester and I found that it encouraged good coding practice without sacrificing too much functionality. I highly recommend that other data scientists check it out.<p>Streamlit.io
Trying my best to solve some of these: <a href="https://github.com/wrnrlr/foxtrot" rel="nofollow">https://github.com/wrnrlr/foxtrot</a>
Notebooks are sort of like democracy: the worst form of government except all the others.<p>You need to pick the best tool for the job, and often in machine learning that tool is a notebook.
Why not teach data scientists how to write software effectively? Those are smart people, it’s not like using version control, writing unit tests and extracting common code into libraries is rocket science.
What is wrong with computational notebooks? Many things, but let us appreciate how to use them better instead of only criticising them.<p>The world is much better with notebooks in it. I'm not sure the article covered R notebooks, which are really good for sharing information and analysis; I just wonder how to use them better.<p>Of course they can always be improved. But I would promote more expansion: how about a Lisp notebook, a Clojure notebook, a JS notebook, and a Forth notebook?<p>The real problem is whether you can have an OO notebook. The notebook model is "serial" and oriented around graphics and data, not around "messy" classes or trigger-based systems. Hence, if I may, the real problem is scoping: it is very hard to visualise a live OO system, unlike a live functional or even a stack-based system.<p>It is not the notebook itself that is the problem; even an imperfect tool has its uses. But where it cannot reach, an alternative may have to be thought about, just as we send our Voyagers outside the solar system to places we cannot go ourselves.<p>Live long and prosper.