Authorization: A Thought Scaffold

How do you feel about the security of your system? Are you sure of it, or are you only sure that you’re probably missing something? Did you “check all the checkboxes” you found so far, only for new ones to keep popping up? Let’s try to explore the kinds of checkboxes there are, in the hope that it helps us frame our minds to think more thoroughly about security. Expect mostly questions and don’t expect my personal answers. If I were to generalize them, I’d have to play it safe and suggest what you may call “overkill”. I dislike making recommendations out of context. You have that context. I have the questions. Let’s go!

It’s more about what you don’t than what you do

Nothing exists in complete isolation. This is true of whatever it is you secure. To be useful, your “secured piece” will interact with its surroundings. Those surroundings will further interact with their surroundings, and so on. What are you doing about this?

  • You’re not concerned: nothing, it doesn’t bother you. Why are you reading this?

  • You’re an ultimate control freak: nothing ever leaves your control, not even users. The entire system is closed off and exists just for its own sake. Nobody on the outside knows anything about it.

  • Somewhere in between. You are doing your best to stay ahead of the attacks but accept that some risk remains.

Unless you belong to the first kind, let’s go over various aspects and granularities of security considerations. Note that some of them fall in a natural order while others overlap.

Route sensitivity

Have a look at this diagram. Suppose that you’re chiefly responsible for what is managed by the components marked as in your control. This includes the data that comes in as well as validating the inputs. You will be held responsible if data in your trust is misused. As a baseline, assume that nothing outside the dashed blue box is in your control. In fact, assume that no other “box” is well managed or necessarily trusted. They don’t implement your level of security practices and may be hacked on their own. Would you trust them to access your data? How about letting them peek into your input data? Is it OK for data in your trust to pass through those boxes on the way between your end user and the systems you control? How do you ensure that trust, or prevent the data from going where it shouldn’t? Do you:

  1. Cut that communication entirely?

  2. Demand adherence to your expectations?

  3. Not care, as it isn’t your problem?

Authentication

Does it matter who you are communicating with? Is everyone the same once they can authenticate, or should their access vary? Who do you authenticate? Do you only worry about the end user, or do you authenticate all the components between your systems and the end user as well? If you don’t, how do you trust the route? Is the end user authentication transferable? Can another user take a stolen authentication response and impersonate the true owner? These are just some of the many questions pertaining to authentication.

Some of those challenges can be addressed by shielding an existing service with an authenticating wrapper. Wrappers like this exist, sometimes in the form of backends-for-frontends (BFFs). They can help a lot but, as we dig deeper, we’ll see that they can’t address everything.
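To make the idea concrete, here is a minimal TypeScript sketch of such a wrapper, under assumed names and shapes (the session map, callInternalService) that stand in for a real token verifier and a real proxy; nothing here is any specific library’s API.

```typescript
interface Session { userId: string; roles: string[]; }

// Hypothetical session store: only tokens issued by the wrapper itself are known.
const sessions = new Map<string, Session>();

// Stand-in for the shielded internal service, which trusts whatever reaches it.
async function callInternalService(path: string, session: Session): Promise<string> {
  return `internal response for ${path} on behalf of ${session.userId}`;
}

// The wrapper: nothing unauthenticated gets past this point.
async function handleRequest(path: string, token?: string): Promise<string> {
  const session = token ? sessions.get(token) : undefined;
  if (!session) {
    throw new Error("401 Unauthorized");
  }
  return callInternalService(path, session);
}
```

The interesting part is not the code but its position: the internal service stays unreachable except through the wrapper.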

Data Access Specificity

What level of detail is needed to differentiate users in terms of the access they have? Here are the possibilities I have encountered so far:

  1. Service level: Full access to the entire service (API) is given or not at all.

  2. Entity type level: Full access to specific endpoints or all instances (and properties) of certain types of entities or not at all.

  3. Entity instance level: Full access to specific instances of entities or resources (all properties, values, …) or not at all.

  4. Property type level: Full access to all values of specific kinds of properties (e.g., personally identifiable information) of all accessible entities or resources or not at all.

  5. Property instance level: Full access to all values of instances of properties specific to instances of entities – a property may be accessible on some entities and not on others, even if access to the entity and the kind of property are both granted. This is often the case when customers’ explicit approval is required for (support) agents to access their data.

  6. Value level: Even if access is otherwise granted to a property, some values in it may need to be hidden, making the property appear to have fewer values than it does, perhaps none. For example, one may see certain enumerated interests, responsibilities, or authorizations of other people but not others. This is also often the case with link properties referencing other controlled entities.

  7. Value detail level: Different parts of a complex or structured value may or may not be accessible. For example, some users may only see the country, region and, perhaps, city of someone’s address, whereas others may also see postal codes and detailed street addresses.

How far do you need to go? Are you sure? Are you going to accomplish this by implementing authorization within your service, or are you going to shield it behind some external wrapper? If so, I’d like you to think about how that wrapper will get all the data it needs to make its access control decisions.
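To make the differences between those levels tangible, here is a hedged TypeScript sketch of what a policy would have to be able to answer at each one. The interface and every name in it are illustrative assumptions, not a real framework; the point is that each deeper level needs more context (the instance, the property, the value) – exactly the data an external wrapper may not have.

```typescript
type Decision = "allow" | "deny";

// Illustrative only: one question per granularity level from the list above.
interface AccessPolicy {
  // Levels 1-2: the whole service, or a type of entity.
  canUseService(userId: string): Decision;
  canReadEntityType(userId: string, entityType: string): Decision;
  // Level 3: a specific instance.
  canReadEntity(userId: string, entityType: string, entityId: string): Decision;
  // Levels 4-5: a kind of property, possibly per instance.
  canReadProperty(userId: string, entityType: string, entityId: string, property: string): Decision;
  // Levels 6-7: individual values, or parts of a structured value.
  // Returns the (possibly redacted) value, or undefined to hide it entirely.
  filterValue(userId: string, property: string, value: unknown): unknown;
}
```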

You may have spotted a mention of “link” properties. Even if you don’t feel the need for anything finer than entity instance level authorization, if you link to controlled entities you’re getting there, perhaps against your will. Suddenly you must hide individual values if they reference entities not meant to be visible. You skip levels (4) and (5) and go straight to (6) or even (7). This is where things get even more interesting. When are you enforcing the link authorization? Is it OK to leak an id or an instance of the target entity if the very relation, or the existence of the target entity, is not meant to be known to the end user? Both leak data to some extent. Would you feel comfortable if other people got to know how many times you visited which kinds of medical doctors, even if they couldn’t see the details of those visits? Fun, isn’t it?
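As a rough illustration, and assuming the hypothetical AccessPolicy above, link filtering might look like this; the target entity type is simply whatever the link points at:

```typescript
// Level 6 in action, as a sketch: link values referencing entities the caller
// must not know about are silently dropped.
function visibleReferences(
  callerId: string,
  targetType: string,
  targetIds: string[],
  policy: AccessPolicy
): string[] {
  return targetIds.filter(id => policy.canReadEntity(callerId, targetType, id) === "allow");
}
```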

Action Specificity

Sometimes it’s not about the data being accessed but what is to be done with it. Take a property, say some form of “status”, whose viewing and even modification are granted, but not to just any possible value. Some state transitions may require additional scrutiny and, thus, authorization. These may not depend only on the destination state. After all, if the sensitive state is already the current one, switching to it isn’t a transition at all. In other words, authorizing specific transitions either requires knowledge of both the current and the intended state, or it really authorizes a transition that some other code has already determined.
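A small sketch of that observation, with made-up states and a made-up rule table; the only point is that the check needs both ends of the transition:

```typescript
type Status = "draft" | "submitted" | "approved" | "archived";

// Illustrative: transitions that need extra scrutiny (and an extra role).
const sensitiveTransitions: Array<[Status, Status]> = [
  ["submitted", "approved"],
  ["approved", "archived"],
];

function mayTransition(callerRoles: string[], from: Status, to: Status): boolean {
  if (from === to) return true; // not a transition at all
  const sensitive = sensitiveTransitions.some(([f, t]) => f === from && t === to);
  return !sensitive || callerRoles.includes("approver");
}
```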

It doesn’t stop there. That was about modifying things. What about simple read-only actions? Suppose your detailed contact information is in the system. You submitted it yourself because you needed to be notified of things that are important to you, and you clearly negotiated that. You should be able to check that information to keep it up to date. Those meant to send you those important notifications should be able to access it as well. You opted out of promotional notifications, but it is the same people and systems doing it all. Moreover, this data may travel through intermediate systems, perhaps even leave your continent, at least for a little while. Is that OK? Who or what is responsible for not abusing your contact information or routing it inappropriately? Who or what is to say “sure, using this for approved stuff is OK, but I won’t allow marketing or data leaving Europe”? Will you:

  1. Rely on the goodness of the human hearts and the ultimate reliability of the human brains of everyone involved, and on their unwavering dedication to honor the stated agreements, never missing anything?

  2. Never trust anyone or anything, keep the data always to yourself, and expose every separate “purpose” as an independently authorizable action?

  3. Form trust with some of the outside world that they will honor your contracts, including not passing the information to anyone or any system you don’t approve of – unapproved proxies, loggers, or other services handled by the same orchestrator?

Which of the above could be used to defend your organization in a court of law? How much effort does each option require?
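For what it is worth, options (2) and (3) tend to reduce to something like the sketch below: access to the contact data is granted per declared purpose rather than wholesale. The purposes and the policy table are invented for illustration.

```typescript
type Purpose = "account_notifications" | "marketing" | "export_outside_eu";

// Illustrative grants: which calling system may use the contact data, and for what.
const allowedPurposes: Record<string, Purpose[]> = {
  "notification-service": ["account_notifications"],
  // the marketing system simply never appears here
};

function mayUseContactData(callerSystem: string, purpose: Purpose): boolean {
  return (allowedPurposes[callerSystem] ?? []).includes(purpose);
}
```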

Some more fun

No, we’re not done. There are cases when the user is authorized both to access a collection of data and to perform actions, but the fine details interact in interesting ways. Imagine a document management system that, among other things, keeps track of the authors of those documents. This system is used by an institution with extreme security needs. Even if you are permitted to view a document, you may not be permitted to know about the existence of some employees – their identities must not only be hidden, you must not even be able to discover that they exist in the first place. Say that one of those secret people goes by the name “James Bond”.

Some of those secret people have participated in writing some of the documents you can enjoy. If you list the authors of those documents, the value level security we already spoke of can take care of excluding references to them – not listing them. Sounds fine so far. Now think of the case where a request is made to get one of the following instead of the list of authors:

  1. Derived data: the number of authors. If two out of five should be hidden from you, the answer must be “3”. Note that this logic can only be performed where the complete data exists: one cannot implement a wrapper that would turn a simple “5” into a “3” without complete access to the data and redoing the logic (see the sketch after this list).

  2. Positive criteria: documents with that id/name having at least one author named “James Bond”. An ideal system should not include the matching document, as that would allow fishing and leak the existence of people it isn’t supposed to reveal.

  3. Derivative criteria: documents with that id/name and more than three authors. The outcome is just like the previous case.

  4. Negative criteria: documents not written by “James Bond”. Guess what: since that person supposedly doesn’t exist, all documents should be treated as matches. This becomes active lying, not just filtering, and cannot be addressed by filtering proxies.
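To illustrate the first and last cases, here is a sketch that reuses the hypothetical AccessPolicy and visibleReferences from earlier; the document shape is invented. Both functions need the complete author list, which is precisely why a filtering proxy in front of the data cannot do this.

```typescript
interface Doc { id: string; authorIds: string[]; }

// Derived data: count only what the caller may see, so a true "5" becomes "3".
function authorCount(callerId: string, doc: Doc, policy: AccessPolicy): number {
  return visibleReferences(callerId, "person", doc.authorIds, policy).length;
}

// Negative criteria: if the caller may not know the queried author exists, even
// documents that author wrote must be reported as "not written by" them.
function matchesNotWrittenBy(
  callerId: string,
  doc: Doc,
  queriedAuthorId: string,
  policy: AccessPolicy
): boolean {
  return !visibleReferences(callerId, "person", doc.authorIds, policy).includes(queriedAuthorId);
}
```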

Those were merely simple, read-only cases. They can get more complex, and they don’t have to be read-only. Think about how the system should handle the following requests, coming from a client unaware that some authors were excluded (a sketch of the first one follows after this list):

  1. Replacing the list of authors with another

  2. Adding new authors in order of how much they contributed

  3. Reordering the list of authors
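Here is a sketch of the first request, reusing the invented Doc and AccessPolicy shapes. The hidden references must be preserved even though the client never saw them, and where they belong in the new order is exactly the unanswered question raised by the second and third requests.

```typescript
// Purely illustrative: a client replaces the author list it could see.
function replaceAuthors(
  callerId: string,
  doc: Doc,
  newVisibleAuthorIds: string[],
  policy: AccessPolicy
): Doc {
  // Keep the authors the caller was never shown.
  const hidden = doc.authorIds.filter(
    id => policy.canReadEntity(callerId, "person", id) !== "allow"
  );
  return { ...doc, authorIds: [...hidden, ...newVisibleAuthorIds] };
}
```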

Bonus: Asynchronicity and Broadcasts

Asynchronous processing plays an important role in software systems. I won’t go into describing it beyond saying that it is about deferred processing that occurs some nontrivial, possibly significant, time after whatever triggers it. This raises a somewhat “philosophical” question: when is authorization due?

  1. When triggered only. You’re OK if the processing executes even though access was removed after the trigger.

  2. At processing time only. You’re OK with giving the initiators a possibly wrong impression.

  3. Both when triggered and at processing time. You’re OK with changing your mind midway (see the sketch after this list).
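If you lean toward option (3), a sketch of the double check might look like this. The grant store and the queue are deliberately simplistic stand-ins, not a real job framework.

```typescript
// Current grants per user; revocation between trigger and processing is the interesting case.
const grants = new Map<string, Set<string>>();

function isAuthorized(userId: string, action: string): boolean {
  return grants.get(userId)?.has(action) ?? false;
}

interface Job { initiatorId: string; action: string; }

const queue: Job[] = [];

function trigger(job: Job): void {
  if (!isAuthorized(job.initiatorId, job.action)) {
    throw new Error("denied at trigger time"); // the initiator learns immediately
  }
  queue.push(job); // the actual work happens later
}

function processNext(): void {
  const job = queue.shift();
  if (!job) return;
  // Re-check: access may have been revoked since the trigger.
  if (!isAuthorized(job.initiatorId, job.action)) return; // drop, or notify the initiator
  // ... perform the deferred work ...
}
```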

Note that some asynchronous processing is repetitive, not “single shot”. A form of this is exemplified by events/notifications meant to keep the recipients informed about what is going on. That one also happens to add more processing parts. The first part culminates with sending the notification. The second is about the recipient(s) consuming it. Is that asynchronous with respect to sending the notification? Is it just a simple “poke” expecting a “pull”? “Hey! I have news, call me back to get them when you can.” Do you eliminate the need for the callback by including the details in that notification? If so, you must authorize the details to be sent before sending them. How many recipients are there? Do you even know who they are? Are you in control? Do you only include the details that everyone on the list can see? Would that not give wrong data to those who could see more?
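One way to frame the “fat notification” dilemma, again reusing the hypothetical AccessPolicy: either each recipient gets a payload filtered for them, as sketched below, or you fall back to a thin “poke” carrying only an identifier and let each recipient pull what they are allowed to see.

```typescript
interface DomainEvent { entityId: string; details: Record<string, unknown>; }

// Per-recipient filtering of a "fat" notification payload (illustrative only).
function payloadFor(recipientId: string, event: DomainEvent, policy: AccessPolicy): Record<string, unknown> {
  const visible: Record<string, unknown> = {};
  for (const [property, value] of Object.entries(event.details)) {
    if (policy.canReadProperty(recipientId, "entity", event.entityId, property) === "allow") {
      visible[property] = value;
    }
  }
  return visible;
}
```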

Instead of a conclusion

Welcome! I apologize for any headaches this may cause. I sympathize, as I’ve been through them. There is a light at the end of the tunnel, but the tunnel itself, in my experience, is littered with the half-baked works of art of those who ignored the warning signs along the way. Most REST APIs appear afflicted by the assumption that it is enough to authorize the HTTP method + endpoint combination, perhaps with filtering of resource entities, but only at the top level. Very often, little to no consideration is given to the security of links, derived data, actions, and criteria-driven fishing. Be ready to face that.

