[personal profile] chooseyourownlogic
 In theory I should be writing my compiler right now, but I just got a new keyboard and am typing at about quarter-speed, so this is a great chance to write down some stuff that's  been bouncing around my head lately and get used to my new keyboard, all at once. Not that this will help me with programming-style typing, but oh well.

Fair warning: this is pretty rambly! I just wanted to get these words out of my head and onto something more concrete; a more coherent version will come when I actually start working on this

It is a tragedy that Python is as popular as it is. I believe this in the usual sense, of course---its semantics are terrible---but that's not what I want to emphasize here. I think a lot of the PL community, myself included, is to blame for this. Our languages are not accessible; Python is. While I think it should be possible for most programmers to learn how to write good functional code in ML or Haskell or whatever, I also don't think it should be mandatory.

From talking to people who like Python, the feature I hear mentioned the most is how easy it is to use. While ML and Haskell are comparable to Python in  terms of the length of a Hello, world program, presumably (unfortunately, the people I'm talking  to don't also know ML or Haskell) the ease of use is diminished somewhat in more complicated programs.

We understand semantics well, yes; that's kind of our job. But programming is a populist skill, to some extent, not limited to people with the time and inclination to learn about monads and category theory and parentheses, and requiring that people sit down and learn these things, instead of presenting languages that use them in an intuitive way, is narrowminded and exclusionary. Programming may not be the most important thing in people's hierarchy of needs, but everything I've been learning and acting on lately in the rest of my life compels me, as someone who hopes to be an expert in the field, to make sure everyone can benefit from my work and the work of my colleagues, not just those of us with graduate degrees.

This is really the driving force behind this post; everything else is details. I'm sure I'm leaving things out, too---I'm using "accessible" here in a very broad sense, but it's also important to make sure languages are accessible to people with disabilities, which is an area in which I am woefully ignorant. My ultimate goal is to have a good, theoretically sound language where I could walk up to anyone in the world and say "If you want to program, you should use <foo>", and not deserve to have an angry mob chase me off. Right now, Python is, I think, the best candidate for that. Certainly all of my favorite languages (SML, Haskell, Agda), while theoretically well-founded, fail the angry mob test. This is a problem with the languages, not with the people who don't want to use them.

As someone who is relatively new to Haskell, I can see a lot of areas in it that I would never, ever want to explain to a new programmer, especially not one who is new to functional programming. In particular, the syntax for composing monadic functions, littered with <$> and <*> in the best case, is atrocious. do-notation is a step towards usability, but does not go anywhere near far enough to make it accessible. Thinking about the types of monadic code and how they compose is also a nuisance, and while I am in favor of keeping track of effects in the type system, I believe the interface to this can be vastly improved.

Lately, I have been thinking of how to fix these things. My ultimate goal is an accessible programming language with a good, ideally somewhat clean, semantics. I haven't defined "accessible" yet, and for good reason---I'm not sure what all it should encompass yet. A few criteria, though, are:

- Lightweight, unobtrusive types
- Clean syntax (no System.out.whatever nonsense, and as few reserved keywords as possible)
- Intuitive semantics. Calling effectful code should be easy. Doing effectful things should be easy. Doing all of these in a SAFE way should, of course, be easy. I should be able to write a for loop or a while loop without much effort expended. Haskell fails all of these.

Python passes all of these criteria (even the type system one---its single type is unobtrusive!), but if I were okay with python I wouldn't be writing this post, or in PL grad school, for that matter. So we also want some criteria for the semantic side of things, with design informed by PL research and experience with languages like Haskell and ML.

- Expressive module system (I come from ML land, and am therefore morally obligated to put this first)
- Expressive type system---I still believe that it is important for effects to be captured in the types of code, for example.
- Operator overloading, presumably with typeclasses
- First-class proofs of correctness. I'm not quite sure how this will fit in, if at all, but it would be nice to have dependent types of some sort baked in to the language so you could prove things about the function you just wrote, if you want to.

At first glance, these seem irreconcilable, but I believe they are not. There is already a lot of work being done to make type systems less intrusive, for example---gradual types approach this from one angle, and bidirectional typechecking from another. It should be possible for the user to customize the amount of static checking they want---a compiler can easily provide passive-aggressive warnings when things are lower than a certain threshold, but there is no reason not to provide users with flexibility there. Ideally, the benefits of types will become self-evident when people can compare the security, correctness, and speed of typed and untyped versions of the same code for themselves.

I also believe that effects can be handled much more intuitively than is currently done in haskell. Every time I have to write "do x' <- x ; case x' of ..." or "f <$> g x <*> g y <*> ..." I die a little inside. I assume someone, somewhere has done research into streamlining this, perhaps by overloading function application or doing some fancy elaboration during typechecking? I haven't looked into this at all, but it seems like an obvious avenue of inquiry.

Programming with the state and reader monads is also a nuisance due to the anonymous nature of their environments. I think we can solve two problems at once here by explicitly declaring the environments that these things refer to: this will remove the need for obtuse combinators such as put, get, and ask, while still providing the same information as the monadic types, just in a different location. For example, instead of writing

incr :: State Integer
incr = do
  i <- get
  put $ i + 1
  get

we could write

state i : Integer

  def incr () =
    i = i + 1
    return i

or something along those lines. This places an added burden on the compiler, but as long as the problem is decidable, this is exactly where the burden should be, rather than the programmer. In an ideal world we would even be able to infer the appropriate monad transformer stacks (one of the most terrifying parts of writing Haskell code), at least in a limited sense, though I don't know how feasible this is in the general case or even a restricted case. If there's no work on this out there already, though, this again seems like an important avenue for research.

Another issue is code verification. Not even Haskell does a good job on this front, since type classes are subject to extralingual laws, but in a language that permits dependent types, this can be fixed. I don't know how well these would compose, but it would be nice to have a built-in layer that allows some forms of verification from within the language itself. The compiler could provide a series of missing proof obligations that would appear in warnings, so the user can choose what level of proof they want to apply. Most people won't care, for example, about proving that their monoids are actual monoids, or that their call to foldr could be turned into a faster call to foldl, but for production code, or critical code, or whatever, it adds an extra level of assurance that no number of code reviews can provide.

Of course, to do any of this, it will take a lot of time, and a lot of libraries, especially since a lot of the effectful stuff will need to be special-cased. Having an FFI can probably help a bit on the library front, though I'm not sure what the FFI should be to---C seems like the obvious choice, but that seems like it would also encourage writing and using a ton of unsafe code, so I'm not sure where to fall on that. Libraries aren't going to write themselves, though.

Anyway, this is all very preliminary. Over the summer I hope to get started on writing some of this as a side project; it will probably proceed at a glacial pace, especially since I am not very familiar with implementing many of these things, but I think a decent first milestone would be a hybrid functional-imperative language with accessible syntax, but that elaborates into a Haskell-like pure monadic backend. We'll see. If anyone knows of any results on any of the things I mentioned, let me know; I'm guessing there's at least one grotesquely undecideable problem up there somewhere. I want to have a high-level functional language that you don't have to be a type- or category-theorist to use, though, and nothing that currently exists that I'm aware of meets these requirements. Of course, I am a type- and category-theorist, so the implementation will also be a great chance for me to learn Idris!

Date: 2013-04-18 11:18 pm (UTC)
winterkoninkje: Shadowcrane (Default)
From: [personal profile] winterkoninkje
[preemptive tl;dr reply after the first couple paragraphs]

Earlier today I learned another new and horrible thing about Python's complete and utter lack of namespace hygiene. Namely: because modules are just dictionaries, the current module is too, which means that modules are just dictionaries stored in a local variable. This in turn means that if you use a local variable of the same name as the module you imported, it can squash the module import. There appears to be some black magic to try to hide this fact, which only makes it worse since it means top-level local variables don't cause the problem, but local variables in function definitions do!

Date: 2013-04-20 04:53 am (UTC)
winterkoninkje: Shadowcrane (Default)
From: [personal profile] winterkoninkje
The reader monad with just ask isn't dynamic scope; it's just "global" variables. Adding local then allows you to have multiple stages of "globality"; i.e., it's a way to lie about there being a join operation on reader monads... or, well, not lie so much as bend the truth by giving local views of the global state (including the const somethingElseEntirely view).

The thing with dynamic scope is all about mutation IMO. I guess you could call the reader monad with local a form of dynamic scope, but I don't think that's really appropriate. For example, with dynamic scope you can use a loop to perform a bunch of mutation and then view the effects of that after the loop exits (so long as you don't exit the current stackframe). Can you achieve this with the reader monad? Bear in mind that every time you call local you're explicitly pushing a new "stackframe" which gets popped when the second argument returns.

You can certainly model parts of dynamic scope with the reader monad, but that's because you can model both of them by explicitly passing immutable state around. The reader monad doesn't introduce the reasoning difficulties that dynamic scope does. Even if you're trying to model DS, the details of making that model explicit will get in the way of the reasoning difficulties. There's nothing wrong with dynamic scope— except when that's not what you want.

Date: 2013-04-18 11:49 pm (UTC)
winterkoninkje: Shadowcrane (Default)
From: [personal profile] winterkoninkje
I also believe that effects can be handled much more intuitively than is currently done in haskell. Every time I have to write "do x' <- x ; case x' of ..." or "f <$> g x <*> g y <*> ..." I die a little inside. I assume someone, somewhere has done research into streamlining this, perhaps by overloading function application or doing some fancy elaboration during typechecking? I haven't looked into this at all, but it seems like an obvious avenue of inquiry.


(1) The Habit language introduces new keywords if<- and case<- which do exactly what you want about getting rid of intermediate names there. I'm not sure why it's been so hard to get this backported into Haskell...

(2) The original paper on applicative functors also shows how to define "idiom brackets" which have the effect of overloading application. It's as unobtrusive as writing iI f (g x) (g y) Ii (if you name your brackets iI/Ii). With more liberal abilities to name things (à la Agda) we could use better names without even touching the language semantics. Unfortunately, in spite of the apparent desirability of idiom brackets, this style has never really caught on. As it turns out, this only covers a very small corner of the things we actually do with applicative functors, so the weight/power ratio is actually very bad.

(3) I'm unaware of any work on DWIM idiom brackets. However, I believe that the proper course of action here (as with monads before them) is not to add more special support for side effects, but rather to eliminate the special support for pure code. The main problem here is the imbalance between pure and effectful code. But given that purity is an effect —the identity monad— the only way to reconcile the imbalance, IMO, is to treat everything as effectful[1]. Ironically, the associative/chiastic lambda calculi I've been working on have this property. I've been ignoring effects so far, since they're not needed for the linguistic work, but that might be a fruitful avenue of research. Another place to look is DDC which has a first-class effect system, rather than repurposing the type system to track effects. Unfortunately, of course, having a first-class effect system means you need to bake in certain effects as part of the language (IO, memory regions,...) whereas others remain second-class (lists as nondeterminism, Maybe/Either as fallible computations,...)

[1] The deeper I look into things, the more I disbelieve in the existence of "pure" code in the first place. Whenever we write "pure" code we're ignoring the fact that the code has observable side effects like how much memory it allocates, how long it takes to run, etc. While many would discount these effects as being "benign", they are the source of some of Oleg's recent rants on the evils of unsafeInterleaveIO since the fact that these side-effects do exist means that they can be observed from any IO code which reflects on the state of the computation. And, in fact, we want to be able to perform such reflection since it means we can do things like choose between different "pure" algorithms based on the current memory pressure.

In addition to all this, ever since hearing a talk David MacQueen gave on one of the ugly parts of ML, I've come to believe that the very ability to give things names is itself an effect. This was something that had been nagging me for a while, but MacQueen's talk really drove it home. But if the very ability to give things names is indeed an effect, then "pure" code is anything but pure— at a deep level that noone has been willing to explore.

Date: 2013-04-20 05:29 am (UTC)
winterkoninkje: Shadowcrane (Default)
From: [personal profile] winterkoninkje
The problem is that monads are everywhere. Everything is an effect. I think it's a bit myopic to focus only on things like state, IO, and exceptions which model the blatantly effectful structure of imperative languages. (That's how DDC started out, btw.) Continuations are a perfectly good effect, and is part of the blatantly effectful structure of Scheme et al. Nondeterminism is a perfectly good effect, and is part of the blatantly effectful structure of logic programming. Things like unification and futures are also perfectly good effects; building futures into the language was essentially the whole selling point for Oz and AliceML! And of course laziness is a perfectly good effect. Not to mention Lindsey and Ryan's work on deterministic parallelism. Nor all the recent work on memory regions. ...

The problem, of course, is that monads do not compose. That is, in general we do not have that m (n a) is isomorphic to n (m a). This is a major problem that has been known for quite a while. Things like monad transformers give us a bit of compositionality, but they provably can't go far enough. There are a few other categorical goodies people have tried using to improve the compositionality/modularity aspect of things, but none of them appears to have been especially promising in the long run.

I don't have a reference for MacQueen's talk, alas. Oleg's rants include this one about Lazy IO and unsafeInterleaveIO, which was discussed to death on reddit recently, and his followup which follows my suggestion on reddit that really the problem already exists for unsafeInterleaveST. At this point, I'm not sure how worthwhile the reddit comments will be to sift through; there's a fair bit of nuhuh-ing and handwringing which doesn't make any intellectual progress on the actual issues that Oleg brings up. I do think, though, that the thread me and Lennart Augustsson had touches on some important issues at stake in this whole line of argumentation about referential transparency (and what we mean by RT, or what we want to mean by RT, and why RT isn't the be-all-and-end-all of programming).

Date: 2013-04-19 12:06 am (UTC)
winterkoninkje: Shadowcrane (Default)
From: [personal profile] winterkoninkje
I'm quite against your proposed way of handling state, for the same reasons I'm against ML doing things that way. For one, the types T and () -> T should always be isomorphic IMO. The only thing I'd even consider allowing to break the isomorphism is if eta fails for functions due to nontermination, as it does in Haskell, meaning that you get an extra bottom in the latter. But even that is something I think should be fixed (e.g., by actively distinguishing pointed CPOs and unpointed CPOs).

For two, one of the big problems with code like this is the exact same problem you run into in imperative languages. Failing to distinguish the side-effecting nature of reading/writing to mutables leads to all sorts of ugliness that breaks equational reasoning. This is part of the reason why coding in C is so ugly. And in order to clean that ugliness up, we already have to explicitly decompose things in order to store intermediate results in local variables; so why not just require it up front as Haskell's reader/writer/state monads do? It'd be good to make the "syntax" more lightweight than it is currently —especially re introducing spurious intermediate names— but not, IMO, by sacrificing the explicitness of those effects nor by muddying the waters with lvalues vs rvalues.

Date: 2013-04-20 06:00 am (UTC)
winterkoninkje: Shadowcrane (Default)
From: [personal profile] winterkoninkje
I think you're missing the point of my complaints. I don't care that you want to capture effects in the type system, I'm all for that. But drawing a distinction between T and () -> T indicates a failure of the type system to properly capture information. This is one of my principal complaints against ML. Arithmetically we have that x = x^1. Category theoretically we model elements of A by the morphisms hom(1,A). Etc, etc, etc. The only times we have ever distinguished T from () -> T is when we have already failed: C-like languages which fail to treat functions as values; Haskell not modeling the equation undefined = \x. undefined; everything ML has ever done. If you want to model effects, great! then model effects; but don't break the definition of unit and arrow in order to paper over the fact that you're not actually modeling effects.

Similarly, the whole idea of needing to teach people about lvalues and rvalues indicates a failure of the language to adhere to the way people think naturally. Ask anyone who's had to teach C to undergrads. Haskell gets this right in distinguishing three things: the name of a mutable cell, the operation for reading from that cell, and the operation for writing to that cell. It's fine to complain about the syntactic overhead of how Haskell makes that tripartite distinction, and I'm all for making the distinction lighter weight. But, it is not okay to fail to make the distinctions at all, nor to conflate the cell with the name of the cell. This is one of those things that computer scientists get inured against after taking so long to learn it, but it's one of the huge stumbling blocks for hoi polloi learning how to program— and that difficulty is not because it's inherently hard, it's because our languages make it harder than it needs to be.

Date: 2013-04-20 06:06 am (UTC)
winterkoninkje: Shadowcrane (Default)
From: [personal profile] winterkoninkje
Also, I'm not sure what all this "points free" monadic code you're talking about is. do-notation is the opposite of points free, and is the dominant way people use monads IME. Those who do go for the points free approach (e.g., Tekmo on reddit) go for things like f <=< g, which is only a problem because it'd be nicer to write f . g. The applicative functor operators you're complaining about aren't points free; they're just applicative, and surely you don't want to get rid of applicative syntax; all the imperative languages have it too.

So where is all the points free code you're complaining about?

Date: 2013-04-19 12:57 am (UTC)
ext_110843: (jacked in)
From: [identity profile] oniugnip.livejournal.com
I am 100% in favor of you doing this thing. Show programmers both n00b and experienced what you can build.

I want fast, safe code and a big standard library, and I never ever want to think about monad transformer stacks or whatever nonsense y'alls spend your time thinking about.

I was thinking about what kind of nonsense I am willing to put up with, though, and I think it basically boils down to "I'm OK with dealing with the fact that this is an actual physical computer". It seems less arbitrary; there's an artifact that got manufactured, and I'm using it, and it's got limitations. That's OK. As opposed to whatever mathematical type/category business people dreamed up. I'm not sure if this is a common position, but I know it's not universal. Especially given your battle cry of "I hate computers!" -- you're coming at this from the other angle.

It's going to take some HCI, pretty sure. Make something accessible! I'd be happy to help where possible! \m/

Date: 2013-04-19 01:00 am (UTC)
ext_110843: (jacked in)
From: [identity profile] oniugnip.livejournal.com
Oh, and if it's monad transformer stacks and whatever under the hood, then great!

(especially if this will let you make it be safe and go fast and/or feel smug about how it's really monads!)

Date: 2013-04-25 12:42 am (UTC)
ext_17921: (Default)
From: [identity profile] lindseykuper.livejournal.com
I was thinking about what kind of nonsense I am willing to put up with, though, and I think it basically boils down to "I'm OK with dealing with the fact that this is an actual physical computer". It seems less arbitrary; there's an artifact that got manufactured, and I'm using it, and it's got limitations. That's OK. As opposed to whatever mathematical type/category business people dreamed up.

To me, the constraints imposed by the process of having to physically manufacture an artifact (say, under a certain budget) seem more arbitrary than the constraints imposed by mathematical abstractions. People didn't just make up types and categories. They discovered them, arguably.

This is not to say that I'm not okay with knowing that it's an actual physical computer -- that I have an arbitrary, fixed amount of parallel resources and storage space and throughput and so on to work with. I'm cool with so-called "integers" actually topping/bottoming out in value somewhere high/low enough that I don't usually have to think about it. But deciding that that's how things are is not, I think, a less arbitrary choice than deciding, say, that covariant generics are bad and wrong.

Profile

chooseyourownlogic

April 2013

S M T W T F S
 123456
78910111213
14151617 181920
21222324252627
282930    

Style Credit

Expand Cut Tags

No cut tags
Page generated Sep. 2nd, 2014 11:30 am
Powered by Dreamwidth Studios