Monday, April 27, 2015

Do Pythonistas Understand Data Abstraction?

I was just reading this page on why not to use getter and setter functions in Python, and realized the author doesn't seem to have any idea what data abstraction is:

"In Java, you have to use getters and setters because using public fields gives you no opportunity to go back and change your mind later to using getters and setters."

This was never the reason I used a getter or setter method in Java! You use getters and setters to achieve data abstraction:

"The representation details are confined to only a small set of procedures that create and manipulate data, and all other access is indirectly via only these procedures."

I had a class GridAgent where the agent held its position on a grid, among other things. But then I wanted to re-do this, so that the position was held in a cell, which held the agent. I had (sometimes) done things the "Pythonic" way, by just accessing the position variable, and of course I had to find every instance of that usage and change it. Where I had done things the "evil" Java (really OOP) way, the code just worked right away.
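To make that concrete, here is a minimal sketch of the refactored design. It is an illustration, not the original code: only GridAgent and position come from the description above, while Cell, place(), the cell attribute, and get_position() are assumed names.

```python
class Cell:
    """Hypothetical grid cell that now owns the position."""

    def __init__(self, x, y):
        self.position = (x, y)
        self.agent = None


class GridAgent:
    """After the refactoring the agent stores no position of its own."""

    def __init__(self):
        self.cell = None  # assumed attribute, set when the agent is placed

    def get_position(self):
        # Callers never learn where the position actually lives.
        if self.cell is None:
            return None
        return self.cell.position


def place(agent, cell):
    """Hypothetical helper that links an agent and a cell both ways."""
    cell.agent = agent
    agent.cell = cell


agent = GridAgent()
place(agent, Cell(2, 3))
print(agent.get_position())  # prints (2, 3)
```

Code that already called get_position() keeps working after the move; only the body of the method had to change.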


  1. I doubt that very many people could tell you what just the word 'abstraction' by itself means. :) So probably not.

    Seriously, though, in my experience, computer guys tend towards the exact opposite of abstract thought (i.e., they tend to the extreme of detail-oriented), even though the process of abstraction seems to be pretty central to computing. It looks (to me) like they use functions, modules & the like to 'pack away' all that abstraction so it's all taken care of and safely out of view and they don't have to think about it anymore, which, once you've done it, renders pretty much everything a great interconnected morass of detail. They can add layer upon layer of complexification on it without ever having to really understand how it all works together, as long as they can keep track of the details of interconnection. It's a structure that lets them focus just on the trees, which they are amazing at, and the forest takes care of itself.

    This is an environment those guys absolutely thrive in, which I think is really pretty cool!

  2. If you were previously accessing GridAgent.position, why couldn't you just define get_position() on GridAgent and put the logic in there to get the position from the cell?

    The author's point is that from a practical standpoint you can do everything with unadorned properties that you can with "private" members with getters and setters. It's not an issue of data abstraction - it's about reducing pointless boilerplate. The degree to which your design achieves data abstraction is independent of your preference for explicit versus implicit accessors.

    If you access an object's internal state from the outside you are violating that form of abstraction, regardless of if you're doing it by using direct property access or calling a getter function. Some OOP zealots think getters and particularly setters are anti-patterns for this reason, arguing that member functions ought to capture the behavior that is dependent on the data rather than simply reporting the data.

    I tend to think that's correct as far as ideological OOP goes. But sometimes it just makes more sense to separate data from behavior instead of conflating the two. It depends on the problem being solved and the types of abstractions needed. For agent-based simulations with rich behavior and lots of state, OOP works pretty well. For transaction processing and large-scale data analysis, not so much. A lot of Java code written for those domains isn't very philosophically object-oriented, although it has the window dressing.

    1. "If you were previously accessing GridAgent.position, why couldn't you just define get_position() on GridAgent and put the logic in there to get the position from the cell?"

      Huh? I mean, yes, you could, but then you have to hunt down every place you access position, and replace it with get_position(). Which is exactly what I contend using get_position() in the first place avoids.

      I understand the complaint about getters and setters, but they at least don't imply the variable is actually in the object: just that the object knows how to access it, wherever it may live.

      "For transaction processing and large-scale data analysis, not so much."

      I built a very large option trading system with multiple real-time data feeds using OOP extensively. It worked very well.

    2. With Python's properties you can create accessor methods that provide the same interface to calling code. Existing code would still just use grid_agent_instance.position.


    3. I generally agree with Matt's theorizing here, and fall closer to the OOP zealot end. But I disagree about OOP not fitting transaction processing. If the logic gets complicated then a good OOP design can help considerably. This was true of the cell phone system I worked on in C++.

    4. "With Python's properties you can create accessor methods that provide the same interface to calling code. Existing code would still just use grid_agent_instance.position"

      This is exactly the misunderstanding in the original article I cited! The point is, we don't want ANY code accessing grid_agent_instance.position.

      And I gave the reason: in this case, I *removed* position from GridAgent completely. There is no position variable to access anymore!

    5. Hmm, Matt, I may have underestimated properties. Trying them out now.

    6. OK, Matt, I see what is going on here. I had seen intros to properties, but they had always shown how to use them to control, say, the range of a variable. I had never seen examples showing that the variable didn't even need to be there.

  3. OK, I'm confused. Do you understand that Python lets you replace the member variable `position` with an @property member method called `position()`, and that this requires NO CHANGES to the calling code, which can continue to access position as though it was a variable?

    If you do understand this, then I have to assume your objection has nothing to do with getters and setters. How could doing things the "evil" Java (really OOP) way have helped you in this case? If you moved position from GridAgent to Cell and there is no way to compute the position from within GridAgent, then you can't possibly avoid changing code that was previously accessing GridAgent's position, regardless of whether it was directly, through a getter, or by any other method.

    1. See my comment above or my latest (5:22pm Thursday) post. No, I had only seen examples showing access control for a variable one actually had in that class before this week!
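The property approach the commenters describe can be sketched as follows. This is only an illustration under assumptions: Cell, its constructor arguments, and the cell attribute are hypothetical names, and only GridAgent and position come from the discussion. The point it shows is that a property can be computed from data that lives elsewhere, with no position variable stored on the agent at all.

```python
class Cell:
    """Hypothetical grid cell that owns the position."""

    def __init__(self, x, y):
        self.position = (x, y)
        self.agent = None


class GridAgent:
    """The agent exposes position without storing it."""

    def __init__(self):
        self.cell = None  # assumed attribute linking the agent to its cell

    @property
    def position(self):
        # Looked up from the cell on every access; nothing is stored here.
        if self.cell is None:
            return None
        return self.cell.position


agent = GridAgent()
cell = Cell(4, 5)
cell.agent = agent
agent.cell = cell

# Calling code is unchanged: it still reads agent.position like a variable.
print(agent.position)  # prints (4, 5)
```

Because no setter is defined, assigning to agent.position raises AttributeError, so the cell remains the only place the position can be changed.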

