Self-Fed is the New Self-Service

Datinuum Newsletter - December 25th, 2023

Eric Gonzalez
December 25, 2023

Data Unfiltered

Self-Service is Dead

When I first began my career in data and heard the term self-service analytics, the concept made complete sense.

Users would be able to access, utilize, and build whatever is needed to take action on the data at an organization. We could shift from being a one-size-fits-all reporting or data solution to meeting each individual’s needs and wants.

As I have grown in my career, I have realized that the vision for self-service analytics and my beliefs in the concept were naive.

A thriving self-service analytics environment requires that:

All the data are vetted, validated, and from the same source.
Everyone agrees to use one tool or tech stack.
Users are willing to collaborate and share best practices.
The data platform is flexible and scalable for new data and users.
Users are data literate and self-sufficient with the enterprise tooling.

Unfortunately, few, if any, of those requirements are met.

All the data are vetted, validated, and from the same source

In a self-service environment, users are empowered to create reports or variations of reports to fit their needs.

If they are all pulled from the same data source, then all data are vetted and validated, and there ideally wouldn’t be any discrepancies between two users’ or departments’ reports. Unfortunately, users tend to take the empowerment to new heights when they introduce new external data to “self-serve” with their existing data. The data may be helpful, but the newly introduced data haven’t been vetted or validated, making it likely that those reports will now stray from the ground truth of the enterprise data.

To prevent this, strong data governance must be in place to keep shadow IT from developing, where users acquire and use their own data to fit their needs.

Everyone agrees to use one tool or tech stack

This seems like a simple requirement, but often, there can be 2, 3, or 5 different BI and reporting tools within a company (not including Excel).

The reasons vary:

Legacy tools are never transitioned to new platforms.
One-off use cases that “need” full integration with a unique BI tool.
Lack of clear data ownership enables the federated “owners” to choose a tool.

Regardless of the reason, having several BI and reporting tools creates a messy self-service environment.

What happens when a new person comes to the company? What tool do they use?
What if department A uses one tool and department B uses another, but a dashboard needs to be developed for data that corresponds with both departments?
Who is paying the licensing fee for 5 different tools? Are we missing out on potential cost savings by having more users in one environment?

If a company has strong data leadership, the data leader will evaluate each tool and choose the one that fits the most needs, getting everyone on the same tech stack.

Whether you want to go down the self-service path or not, trimming the number of vendors and centralizing your data and analytics will help all companies develop more effectively.

Users are willing to collaborate and share best practices

Ideally, this is happening at the company regardless of the department; however, with data, depending on the culture of an organization, people may gatekeep their progress.

As people innovate, they may “safeguard” their products because they don’t want someone else in the company “stealing” what they’ve built. Unfortunately, this type of attitude creates silos within a company and strife between leaders and departments. Over time, the silos will develop deeper fissures within the company, with the data becoming segmented and disjointed as well.

The result is finger-pointing, shared blame, and, ultimately, a loss of trust in the data.

That loss of trust then manifests in new products being purchased for millions to remediate process issues and people gaps and can also result in people losing their jobs.

All leadership, notably the CEO, COO, and CDO, must ensure that people collaborate in innovation and development and that everyone does what is best for the company.

The data platform is flexible and scalable for new data and users

When developing data solutions, you’re typically being reactive to what has happened and trying to create the best environment for that moment.

The problem is that solving previous issues doesn’t guarantee success for future problems.

The data platform must be flexible and scalable to:

Ingest new data
Handle schema drifts effectively
Onboard/offboard users with little friction

If the platform is not proactive and cannot adjust over time, the data team will constantly be reacting to problems and putting out fires, while the rest of the company will eventually stop using a platform that no longer serves their current needs.
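
As a minimal sketch of what handling schema drift can look like, assume the expected schema is kept as a plain dictionary of column names to type names. The check below compares an incoming batch against it and reports added, missing, and retyped columns. The function name detect_schema_drift and the example columns are hypothetical and are not tied to any specific platform.

```python
def detect_schema_drift(expected, incoming):
    """Compare two {column: type_name} mappings and report the differences."""
    # Columns that appear in the new batch but not in the agreed schema.
    added = sorted(set(incoming) - set(expected))
    # Columns the agreed schema expects but the new batch no longer has.
    missing = sorted(set(expected) - set(incoming))
    # Columns present in both whose declared type has changed.
    retyped = sorted(
        col for col in set(expected) & set(incoming)
        if expected[col] != incoming[col]
    )
    return {"added": added, "missing": missing, "retyped": retyped}


# Hypothetical schemas, for illustration only.
expected_schema = {"order_id": "int", "amount": "float", "region": "str"}
incoming_schema = {"order_id": "int", "amount": "str", "channel": "str"}

drift = detect_schema_drift(expected_schema, incoming_schema)
print(drift)
```

Running a check like this before loading each batch lets the data team decide up front whether to accept, map, or reject a changed feed instead of finding out from a broken report.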

Users are data literate and self-sufficient with the enterprise tooling

In self-service, “if you build it, they will come” is only valid if those who come can build, too.

Companies will invest millions in building a data platform and paying for expensive BI tools, and then no one outside the data team knows how to access data or build a report.

Organizations must invest in data literacy programs to educate everyone about the following:

The data of the business.
Data as they pertain to their role.
How competitors use data (in)effectively.

Once people grasp those three concepts, they can move on to tool training in Power BI, Tableau, Looker, and so on.

If you skip straight to development in new tools, users will likely develop the same Excel reports in a more expensive and less intuitive platform instead of unlocking the new platform’s capabilities, and eventually switch back to Excel or whatever they’re comfortable with.

Self-Fed is the New Self-Service

What I have seen over and over again with various clients and departments is that when they request self-service, what they genuinely want is self-fed analytics.

Instead of:

Going to a report.
Filtering on what they need to see.
Pulling that into a PowerPoint or sending a screenshot in an email.

Users would rather:

Develop canned reports automatically emailed on a specified cadence (daily, weekly, monthly).
Create thresholds with alerting or subscriptions.
Reach out to a central team with ad-hoc data requests.

In this scenario, the data team can serve as the organization's central control tower, enabling the canned reports and alerts while reducing ad-hoc requests by developing self-feeding analytics for users.

Self-fed is more valued as the users don’t have to spend hours trying to sift through and decipher the data and instead receive the data they desire when needed.
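
To make the threshold-and-alert idea concrete, here is a minimal sketch. It assumes each user registers a rule (a metric, a limit, and a direction) and that the data team runs the check on a schedule. The names ThresholdRule and evaluate_rules, the metrics, and the recipients are all hypothetical, and delivery by email or chat is left out.

```python
from dataclasses import dataclass


@dataclass
class ThresholdRule:
    metric: str       # name of the metric the user cares about
    limit: float      # threshold value
    direction: str    # "above" or "below"
    recipient: str    # who should be notified


def evaluate_rules(rules, latest_values):
    """Return one alert message for every rule whose threshold is breached."""
    alerts = []
    for rule in rules:
        value = latest_values.get(rule.metric)
        if value is None:
            continue  # no data for this metric in this run
        if rule.direction == "above":
            breached = value > rule.limit
        else:
            breached = value < rule.limit
        if breached:
            alerts.append(
                f"To {rule.recipient}: {rule.metric} is {value}, "
                f"{rule.direction} the threshold of {rule.limit}"
            )
    return alerts


# Hypothetical rules and metric values, for illustration only.
rules = [
    ThresholdRule("daily_revenue", 10000.0, "below", "sales-lead"),
    ThresholdRule("error_rate", 0.05, "above", "ops-lead"),
]
latest = {"daily_revenue": 8200.0, "error_rate": 0.01}

alerts = evaluate_rules(rules, latest)
for message in alerts:
    print(message)
```

In this example only the revenue rule fires, so the sales lead hears about the dip without opening a dashboard, and the ops lead hears nothing because nothing needs attention.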

Data in the World

Apple Enters the LLM Chat

Apple quietly entered the LLM discussion in October with an open-sourced model called Ferret. Unsurprisingly, Apple joined late, though it is surprising that the project is open-sourced.

Apple quietly released an open source multimodal LLM in October

Apple is belatedly being celebrated on social media for releasing Ferret, an open source multimodal LLM, in October 2023.

venturebeat.com/ai/apple-quietly-released-an-open-source-multimodal-llm-in-october/#:~:text=With%20little%20fanfare%2C%20researchers%20from,did%20not%20receive%20much%20attention.

I expect Apple to work on more efficient architectures that are SLM-driven and can be run locally so that their entire hardware ecosystem—from Watches to MacBooks—can run LLMs smoothly.

Tech Hubs are Shifting

San Francisco, New York, and Boston used to be considered the primary regions where you could find tech talent. While they continue to be tech hubs, following the pandemic and the availability of remote work, new tech hubs are popping up all over the U.S.

Tech Hubs Are Losing the Talent War to Everywhere Else

As the tech industry grows up, talent and funding are spreading out. That’s bolstering smaller cities all across the country.

www.wsj.com/tech/personal-tech/smaller-tech-hubs-rise-of-the-rest-6c2b142f?mod=tech_lead_pos4

As people from the original tech hubs have relocated, cities like Denver, Austin, and Miami have seen considerable gains in tech talent. Witnessing this shift was a primary driver for me moving back home to Miami during the pandemic.

Data Career Tips

Give Yourself Permission

A mistake I made early in my career was waiting for others to give me the agency to do something or the permission to move to the next level.

The rationale and assumptions behind my choice (or lack thereof) were that other people:

Had been there before.
Knew what it takes to get to the next step.
Would let me know when it was my time and when I was ready.

The first two assumptions were spot on. People learned to level up from previous experiences, reading and learning, or working with mentors and colleagues.

The last assumption was my mistake.

It’s not that others don’t want you to succeed; rather, they aren’t going to push you to the next step or promote you to that new role if you can’t see yourself in it.

Your manager won’t elevate you into a management role if she has to manage all your tasks to completion.
The executive team won’t see you as a candidate for the CDO role if you’ve never provided any insight on data strategies for the company.
Partners won’t want you to manage accounts if you consistently ask someone else to complete your MSAs and SOWs.

So, how do you break the cycle into that new role you want?

You don’t need to be arrogant, ungrateful, or impatient; your goal is the opposite. You must slowly start taking on parts of the role you’re aiming for.

If your manager hasn’t seen you manage anything before:

Start effectively managing all your deliverables.
Shift from reactive to proactive in your way of thinking.
Identify ways to delegate and take on more strategic roles on projects.
Learn how to manage up (arguably the most underrated skill at any level).

If the executive team hasn’t seen you in an executive capacity:

Identify ways to improve the existing yearly initiatives and roadmap.
Hold conversations with team members to understand pain points and strategize how to resolve those pain points with data.
Generate new ideas to help increase revenue or decrease costs with data-driven strategies.

If partners don’t want you to manage accounts:

Shadow others and offer to complete their contracts to familiarize yourself with the process so that you can then execute it alone.
Source leads and develop strong relationships so you bring in sales and clients ask you to lead the account.

None of these suggestions will thrust you immediately into the situation you want because that takes time. Each of these, however, is an iterative step in moving you into that next position. Others will stop seeing you as the person you’ve always been and instead recognize your potential for what you can become.

Stop waiting for someone to anoint you as “ready” and give yourself permission.

Data Histories

Euclid and His Shapes

Euclid is most well-known for the mathematical formula that bears his name and is widely used in many applications today—the Euclidean distance.

Although the Euclidean distance is his most well-known contribution to the field, Euclid's impact on math went far beyond a straightforward formula.

Euclid was a Greek mathematician active around 300 BC, and some would argue that he is the father of geometry due to his book, Elements, which is built on 5 postulates and covers:

Lines
Polygons
Theorems
Right angles
Circles and radius
+ more on geometry as we know it today

The book also contains one of the best-known early proofs of the Pythagorean theorem, the theorem behind the Euclidean distance formula.

His notable formula, the Euclidean distance, measures the straight-line distance between two points. However, the formula has limitations: a straight line is not always a realistic path between two points, and the measure becomes less informative as the number of dimensions in the data grows.
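
For readers who want to see the formula in action, here is a minimal sketch in Python. The Euclidean distance is the square root of the sum of squared differences between the coordinates of two points. The function name euclidean_distance and the sample points are my own, chosen for illustration.

```python
import math


def euclidean_distance(p, q):
    """Straight-line distance between two points with the same dimensions."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    # Square each coordinate difference, add them up, take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# The classic 3-4-5 right triangle: the distance is the hypotenuse.
print(euclidean_distance((0, 0), (3, 4)))  # 5.0
```

The 3-4-5 example is the Pythagorean theorem at work: the two coordinate differences are the legs of a right triangle, and the distance is its hypotenuse.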

Because of these limitations, related measures and algorithms built on the same geometric ideas have emerged, including:

The Manhattan distance for map-based applications.
Cosine similarity in recommender systems.
KNN and other ML algorithms.
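
To show how two of these measures differ from the straight-line distance, here is a minimal sketch in Python. The Manhattan distance adds up the absolute differences along each axis, like walking city blocks, while cosine similarity compares the direction of two vectors and ignores their length. The function names and sample values are my own, chosen for illustration.

```python
import math


def manhattan_distance(p, q):
    """Sum of absolute coordinate differences (grid or city-block distance)."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    return sum(abs(a - b) for a, b in zip(p, q))


def cosine_similarity(p, q):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    if len(p) != len(q):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0 or norm_q == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_p * norm_q)


# Same two points as a 3-4-5 triangle: 3 blocks one way plus 4 the other.
print(manhattan_distance((0, 0), (3, 4)))  # 7

# Perpendicular vectors share no direction at all.
print(cosine_similarity((1, 0), (0, 1)))  # 0.0
```

Two vectors that point the same way, such as (1, 2) and (2, 4), score very close to 1.0 on cosine similarity even though they sit far apart by Euclidean distance, which is why the measure is popular for comparing taste profiles of different sizes.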

His work serves as the foundation for the geometry and math we use in school and is also a foundational component of the products we use daily, such as Netflix's recommender system or Google Maps.

Datinuumber of the Week: 167

Amazon projects net sales of $160 to $167 billion this holiday quarter, with the vast majority of consumers (74%) shopping online on Amazon (myself included).

Thank You

Thank you for subscribing and reading this week’s newsletter.

If you enjoyed the newsletter, the best way to help is by sharing it with colleagues and friends.

If you prefer to listen to the newsletter, the Datinuum Podletter will be released weekly on Apple, Spotify, or your favorite podcast player.

Feedback is a gift. Please reach out to [email protected] with any feedback or questions.