DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

The software you build is only as secure as the code that powers it. Learn how malicious code creeps into your software supply chain.

Apache Cassandra combines the benefits of major NoSQL databases to support data management needs not covered by traditional RDBMS vendors.

Generative AI has transformed nearly every industry. How can you leverage GenAI to improve your productivity and efficiency?

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Related

  • Exploring Exciting New Features in Java 17 With Examples
  • Commonly Occurring Errors in Microsoft Graph Integrations and How To Troubleshoot Them (Part 4)
  • JSON-Based Serialized LOB Pattern
  • Rust’s Ownership and Borrowing Enforce Memory Safety

Trending

  • Beyond Simple Responses: Building Truly Conversational LLM Chatbots
  • MySQL to PostgreSQL Database Migration: A Practical Case Study
  • Securing the Future: Best Practices for Privacy and Data Governance in LLMOps
  • System Coexistence: Bridging Legacy and Modern Architecture
  1. DZone
  2. Data Engineering
  3. Data
  4. NewTypes Aren't As Cool As You Think

NewTypes Aren't As Cool As You Think

By 
John De Goes user avatar
John De Goes
·
Nov. 20, 14 · Interview
Likes (0)
Comment
Save
Tweet
Share
10.2K Views

Join the DZone community and get the full member experience.

Join For Free

My last post talked about what’s wrong with type classes (in general, but also specifically in Haskell). This post generated some great feedback on Reddit, including some valid criticism that I didn’t explain why I hated on newtypes so much.

I took some of that feedback and incorporated it into a revised version of the post, but I have even more to say about “newtypes," so I decided to write another blog post.


What’s in a Newtype

Newtypes are a feature of Haskell that let you define a new type in terms of an existing type.

In the following example, I create a newtype for Email, which “holds” a String.

newtype Email = Email String

They are similar to type synonyms (type Email = String), except that type synonyms don’t create new types, they just allow you to refer to existing types by other names.

Every newtype can be easily translated into a data declaration. In fact, only the keyword changes:

data Email = Email String

There’s a slight semantic difference between the two, but for purposes of this blog post, any criticism I have against newtypes apply equally to similar constructs modeled with type or data.


The Promise of Newtypes

Newtypes are used to provide and select between alternate implementations of type classes for some base types. I think that’s a hack (albeit a necessary one), but I’ve already talked about this so I won’t belabor it here.

The other promise of newtypes is that we can use them to make our code more type safe. Instead of passing around String as an email, for example, we can create a super lightweight “wrapper” around String called Email, and make it an error to use a String wherever an Email is expected.

This practice isn’t restricted to Haskell. Even in Java, it’s considered good coding practice to wrap primitives with classes whose names denote the meaning of the wrapper (Email, SSN, Address, etc.).

There’s a part of this promise that’s certainly true. If I have to define a function accepting four parameters, and three of them are strings, but one of those strings denotes an email, then I have two choices:

  1. Model the email parameter with a String. In this case, I may accidentally use the email where I intended to use the other two string parameters, or I may use one of the other two string parameters where I intended to use the email. Considering just these choices, there are five ways my program may go wrong if I use the wrong name in the wrong position.
  2. Model the email parameter with a newtype. In this case, I cannot use the email where I intended to use the other two string parameters, because the compiler may stop me. Similarly, I cannot use the other two string parameters where I intended to use the email, for the same reason. Looking at just these choices, there are 0 ways my program may go wrong.

Thus, newtypes, like all good FP practices, reduce the number of ways my program can go wrong.

Unfortunately, in my opinion, they don’t go nearly far enough.


False Security

For most intent and purposes, newtypes are isomorphic to the single value they hold.

In my preceding example, given a String, I can get an email (Email "foo"). Given an Email, I can also get a String, e.g. by pattern matching on the Email constructor.

Stated differently, and also approximately because I’m ignoring bottom: the String and Email types are isomorphic; they contain the same inhabitants, for any useful definition of “same”. The only substantive difference between the preceding String and Email is the name of the data constructor (call Email an AbergrackleFoozyWatzit, and what has changed?).

Hence, my previous criticism of newtypes as “programming by name”.

By themselves, newtypes don’t really reduce the number of ways my program can go wrong. They just make it a bit harder to go wrong. But any newtype is isomorphic to the value it holds, and it’s trivial to convert between the two.

In fact, if my code doesn’t need to convert between the two (either directly or indirectly), then it’s better off generic. That is, if I never need to convert an Email to a String, or a String to an Email, then I should really write the code generically to work with any value (even if that means making data structures or functions more polymorphic).

Parametricity provides a massive reduction in the number of ways my program can go wrong. Newtypes, on the other hand, just make it a bit harder to go wrong, by adding one layer of indirection.

In this example, as with many newtypes, I’ve created a bad isomorphism. The domain model of an email is not isomorphic to the data model of a string. But by using a newtype, I have implicitly declared that they are isomorphic.

Calling a string an email may make me feel better, because of the different name, but fundamentally, with a newtype, it’s still a string, and I’m only ever one more step away from going wrong.

In my experience, too many newtypes create an isomorphism between things that, properly modeled, are not isomorphic.

Fortunately, there’s a well-worn workaround that lets us get more mileage out of newtypes.


Smart Constructors

If I define Email in a module, I can make its data constructor private, and export a helper function to construct an Email. Such helper functions are called smart constructors.

They can be used to break the natural isomorphisms created by newtyping.

An example is shown below:

newtype Email = MkEmail String

mkEmail :: String -> Maybe Email
mkEmail s = ...

In this example, I create a smart constructor which does not promise that it can turn every string into an email. It promises only that it might be able to turn a string into an email, by returning a Maybe Email.

With the smart constructor approach, I’ve modeled the fact that while every email has a string representation, not every string has an email representation.

Going back to my earlier example of passing same-typed parameters to a function, if I use a smart constructor, then while I can still use an email anywhere a string is expected (by converting), I can’t use a string anywhere an email is expected. (Well, ignoring the fromJust abomination!)


Smart Constructors, Dumb Data

Smart constructors take us one step closer toward modeling data in a type safe fashion.

Unfortunately, I still don’t think it’s far enough.

With smart constructors, our data model is fundamentally underconstrained, so we patch that up by restricting who can create the data. That’s putting a band-aid on the real problem.

Why not just solve the root issue — viz., that our data model is underconstrained?


Dumb Constructors, Smart Data

The best solution to a great many newtype problems, I believe, is creating a data model where there is a true isomorphism between the entity modeled by our data and the values passed to the data constructor.

That is, creating a data model such that there exists no regions in our data’s state space which correspond to invalid states.

Email is a simple example, because there are well-defined models for what constitutes a data model, which can be translated into data declarations in straightforward, if tedious fashion.

(To some extent, it’s a failure of most languages I know that such specifications cannot be easily translated into code without tedious boilerplate!)

When our data declaration precisely fits our data model specification, there’s no need for smart constructors, and no need for newtypes. There’s far fewer ways that code can go wrong, and because our domain model is captured precisely by our data model, we can transform that data model in ways that make semantic sense (e.g. transforming just the name part of an email, since we’re now in the realm of structured data).


Summary

As I’ve explained in this blog post, I don’t really hate newtypes. I think they’re very useful, and I do use them, because they make it more difficult for my programs to go wrong.

Ultimately, however, I think a lot of problems solved with newtypes (modeling coordinates, positions, emails, etc.) are better solved by more precise data modeling. That is, by making our programs stop lying about isomorphisms.

Precision may be tedious due to limitations of the languages we work in, but honestly, what’s more tedious than debugging broken code?

Data (computing) Data model (GIS) Strings Data Types

Opinions expressed by DZone contributors are their own.

Related

  • Exploring Exciting New Features in Java 17 With Examples
  • Commonly Occurring Errors in Microsoft Graph Integrations and How To Troubleshoot Them (Part 4)
  • JSON-Based Serialized LOB Pattern
  • Rust’s Ownership and Borrowing Enforce Memory Safety

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!