Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestions for Wrapper and Convenient Methods (add protect/unprotect all) #1208

Open
PFython opened this issue Jun 6, 2023 · 10 comments · May be fixed by #1273
Open

Suggestions for Wrapper and Convenient Methods (add protect/unprotect all) #1208

PFython opened this issue Jun 6, 2023 · 10 comments · May be fixed by #1273
Assignees

Comments

@PFython
Copy link

PFython commented Jun 6, 2023

Is your feature request related to a problem? Please describe.
First of all, thanks for all the effort you've put into this package already - it's really helped me on my latest project.

I created the following general purpose 'wrapper' for quickly creating a Worksheet/Tab based on key, title, sheet name, sheet index, or account path. It also includes convenience methods for:

  • .view() : Opening a worksheet in your browser directly
  • .protect() : Protecting a whole sheet (except for the service account running the code)
  • .unprotect() : Unprotecting a whole sheet
  • .df : Converting directly to a Pandas DataFrame
  • pandas.DataFrame.to_gspread() : Saving a Pandas DataFrame (created using .df) back to its associated worksheet

I've also added convenience attributes to the worksheet object:

.email : the service account email (needs to be 'invited' to share a Google Sheet before having access)
.account_path : path to the service account JSON file

It wasn't immediately obvious to me where to add these in your structure otherwise I'd have created a pull request but the basic code is below if you think this would be a helpful utility to include?

Describe the solution you'd like

from pathlib import Path
import webbrowser

import gspread

def _view(self: gspread.worksheet.Worksheet):
    """Open this Google Sheet in the default web browser"""
    webbrowser.open(self.url)

def _protect(self):
    body = {
        "requests": [{
            "addProtectedRange": {"protectedRange": {
                "range": {"sheetId": self._properties['sheetId'],},
                "warningOnly": False,
                "description": f"LOCKED by: {self.email}",
                "editors": {
                        "domainUsersCanEdit": False,
                        "users": [self.email]
        },}}}]}
    try:
        return self.spreadsheet.batch_update(body)
    except gspread.exceptions.APIError as api:
        print(f"'{self._properties['title']}' is already protected")
        return api

def _unprotect(self):
    range_id = self.spreadsheet.list_protected_ranges(self.id).pop()
    range_id = range_id['protectedRangeId']
    body = {
        "requests":
            [{"deleteProtectedRange": {"protectedRangeId": range_id}}]
    }
    try:
        return self.spreadsheet.batch_update(body)
    except gspread.exceptions.APIError as api:
        print(f"'{self._properties['title']}' is already protected")
        return api

def _df(self):
    """
    Return a Pandas Dataframe from the specified Google Sheet object.

    Add attribute:
      pd.DataFrame.sheet : gspread.worksheet.Worksheet

    Add method:
        df.to_gspread()
    """
    global pd  # Just-in-time import, so as not to 'force' Pandas requirement
    import pandas as pd  
    df = pd.DataFrame(self.get_all_records(numericise_ignore=['all']))
    df.sheet = self
    setattr(pd.DataFrame, "to_gspread", _to_gspread)
    return df

def _to_gspread(self: pd.DataFrame, *args, **kwargs):
    """
    Save (ie. clear and update) the Pandas DataFrame to its parent Google Sheet
    (gspread.worksheet.Worksheet object) specified in self.sheet
    """
    sheet = self.sheet
    sheet.clear()
    sheet.update([self.columns.values.tolist()] + self.values.tolist())
    print(f"Saved: {sheet.spreadsheet.title}\n{sheet.url}")

def Worksheet(key_or_title=None, sheet_name="", sheet_index=0, account_path=""):
    """
    Return a Google Worksheet object (gspread.worksheet.Worksheet) from the
    specified workbook and worksheet, or the first worksheet if not specified.

    sheet_index uses normal Python counting ie. starts at 0.

    Added helper methods and shortcuts:

      gspread.worksheet.Worksheet.protect()
      gspread.worksheet.Worksheet.unprotect()
      gspread.worksheet.Worksheet.view()
      gspread.worksheet.Worksheet.df
      pandas.DataFrame.to_gspread()

      .account_path = account_path
      .email = client.auth._service_account_email

    """
    if key_or_title is None and "WORKBOOK_ID" in globals():
        key_or_title = WORKBOOK_ID
    if account_path == "" and "GOOGLE_SERVICE_ACCOUNT_JSON_PATH" in globals():
        account_path = GOOGLE_SERVICE_ACCOUNT_JSON_PATH
    try:
        account_path = account_path.resolve()
        client = gspread.service_account(filename=account_path)
    except FileNotFoundError:
        print(f"FileNotFoundError:\n{account_path}")
        return
    try:
        book = client.open(key_or_title)
    except gspread.exceptions.SpreadsheetNotFound:
        try:
            book = client.open_by_key(key_or_title)
        except gspread.exceptions.APIError:
            print(f"gspread.exceptions.APIError:\n  '{key_or_title}' not recognised as a key or title.")
            return
    try:
        if sheet_name:
            sheet = book.worksheet(sheet_name)
        else:
            sheet = book.worksheets()[sheet_index]
    except (gspread.exceptions.WorksheetNotFound, IndexError, NameError):
        key = f"'{sheet_name}'" if sheet_name else f"index {sheet_index}"
        print(f"gspread.exceptions.WorksheetNotFound:\n  sheet {key} not found.")
        sheet = None
    # Add helper methods and shortcuts
    setattr(gspread.worksheet.Worksheet, "view", _view)
    setattr(gspread.worksheet.Worksheet, "protect", _protect)
    setattr(gspread.worksheet.Worksheet, "unprotect", _unprotect)
    setattr(gspread.worksheet.Worksheet, 'df', property(_df))
    sheet.account_path = account_path
    sheet.email = client.auth._service_account_email
    return sheet

If you like (some or all of) this and want to incorporate it, I'd be happy submit relevant updates to the README doc.

@lavigne958
Copy link
Collaborator

Hi thank you for this generous proposal, we'll have a close look at it when we have time.

It does look interesting. As long as you needed to do these wrappers that means we are lacking some features.

We'll be back here with more details and we'll happy if you can provide a PR to introduce some of it.

@PFython
Copy link
Author

PFython commented Jun 6, 2023

Thanks - I was reluctant to submit a PR because I didn't know whether to overwrite/modify an existing class or just add it as an optional wrapper/utility. If you can direct me a little bit as to how/where to try grafting it on, I'll be happy to give it a try, but actually suspect you'll be able to do so quicker, knowing the codebase as you do :)

@alifeee
Copy link
Collaborator

alifeee commented Jun 6, 2023

Hi, great issue! Thanks for taking the time to make these suggestions.

I have some replies, some may be opinionated. I will give a small overview of each attribute/method you suggest.

For reference, I use book and Spreadsheet interchangeably, and also sheet and Worksheet the same.

.view()

Your code is already succinct. I would suggest using gspread.Spreadsheet.url to open the spreadsheet in a browser, with your own code. This means we do not have to add a dependency (webbrowser).

import webbrowser

def open_in_browser(book):
  webbrowser.open(book.url)

...

open_in_browser(book)

.protect() and .unprotect()

These sound like they would be good additions to gspread! I could eventually give it a go to implement them, but I implore you to give it a try:

For .protect(), I would:

  • add a method gspread.Worksheet.protect_all which:
    • calls gspread.Worksheet.add_protected_range with parameters:
      • gspread.Worksheet.row_count & gspread.Worksheet.col_count

For .unprotect(), I would:

  • add a method gspread.Worksheet.unprotect_all which:
    • uses gspread.Spreadsheet.list_protected_ranges to find all protected ranges
    • uses gspread.Worksheet.delete_protected_range to delete them all

Have a look at the Contributing Guide for how to get started with the repository (and tests!)

.df and pandas.DataFrame.to_gspread()

Personally, I would keep converting between these in your own code. In my opinion, there is no need for these to be internal methods of in either library. In the same way as view() above. Check the documentation for suggestion on how to convert to/from pandas.

.email and .account_path

These are both variables which are required for creating a gspread.Client in the first place. I do not see why they should belong to the Spreadsheet.

You can find the email in your credentials.json file, and you use the account_path in your code to create the gspread instance, so you can just use those if you need either variable.

Thanks for the issue! Hopefully you can give implementing protect/unprotect a go!

@PFython
Copy link
Author

PFython commented Jun 7, 2023

Many thanks for the constructive and friendly reply! Yep I'll definitely have a go now you've pointed me in the right direction - probably next week now :) Please feel free to assign this Issue to me!

I saw you've got a list_protected_ranges method so have updated my snippets above and will follow your suggestions for protect/unprotect shortcuts.

All the best!

@alifeee
Copy link
Collaborator

alifeee commented Jun 7, 2023

Super! Let us know if you need help with anything!

@alifeee
Copy link
Collaborator

alifeee commented Aug 14, 2023

@PFython We would still love your help on this, if you are interested.

If you no longer wish to, then I will make a PR adding protect/unprotect shortcuts in the coming weeks.

@PFython
Copy link
Author

PFython commented Aug 15, 2023

Thanks for your patience and gentle prod @alifeee!

I'm almost finished but got stuck because gspread.Worksheet.add_protected_range() requires an email. As far as I can tell from auth.py the Client attribute .auth._service_account_email is only created upon initialisation, so unless .email is added to the Spreadsheet or Worksheet (like my original, verbose, solution) I'm not sure how to recreate it?

Thanks in advance for any insight!
Pete

@alifeee alifeee changed the title Suggestions for Wrapper and Convenient Methods Suggestions for Wrapper and Convenient Methods (add protect/unprotect all) Aug 15, 2023
@alifeee
Copy link
Collaborator

alifeee commented Aug 15, 2023

Ah, I see the issue.

Did you get remove working?

If you are happy to, I'd suggest you submit a draft PR so we don't lose out on any code you wrote. Then, we can take a look at it with our knowledge of the codebase and see if we can think of a good solution. auth is more in @lavigne958's area of knowledge.

@PFython PFython linked a pull request Aug 15, 2023 that will close this issue
@PFython
Copy link
Author

PFython commented Aug 15, 2023

Thanks guys - I just submitted a draft PR: #1273

All the best,
Pete

@lavigne958 lavigne958 removed this from the 5.11.0 milestone Sep 1, 2023
@alifeee alifeee modified the milestones: 5.12, 6.0.0 Sep 28, 2023
@alifeee
Copy link
Collaborator

alifeee commented Oct 19, 2023

PR exists, but we are unsure how to get the email needed for the protected range. See #1273 (review)

@alifeee alifeee removed this from the 6.0.0 milestone Oct 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants